Implementing machine learning in the real world isn’t easy. The tools are available and the road is well-marked—but the speed bumps are many.
That was the conclusion of panelists wrapping up a day of discussions at the IEEE AI Symposium 2019, held at Cisco’s San Jose, Calif., campus last week.
Data science expertise is tough to find, Cisco’s Irving indicated, so companies are looking to nontraditional sources of personnel, such as political science. “There are some untapped areas with a lot of untapped data science expertise,” he says.
Lazard’s artificial intelligence manager Trevor Mottl agreed that would-be data scientists don’t need formal training or experience to break into the field. “This field is changing really rapidly,” he says. “There are new language models coming out every month, and new tools, so [anyone should] expect to not know everything. Experiment, try out new tools and techniques, read, study, spend time; there aren’t any true experts at this point because the foundational elements are shifting so rapidly.”
“It is a wonderful time to get into a field,” he reasons, noting that it doesn’t take long to catch up because there aren’t 20 years of history.
Confusion about what different kinds of machine learning specialists do doesn’t help the personnel situation. An audience member asked panelists to explain the difference between data scientist, data analyst, and data engineer. Darrin Johnson, Nvidia global director of technical marketing for enterprise, admitted it’s hard to sort out, and any two companies could define the positions differently. “Sometimes,” he says, particularly at smaller companies, “a data scientist plays all three roles. But as companies grow, there are different groups that ingest data, clean data, and use data. At some companies, training and inference are separate. It really depends, which is a challenge when you are trying to hire someone.”
Mitigating the risks of a hot job market
The competition to hire data scientists, analysts, engineers, or whatever companies call them requires that managers make sure any work being done is structured and comprehensible at all times, the panelists cautioned.
“We need to remember that our data scientists go home every day and sometimes they don’t come back because they go home and then go to a different company,” says Lazard’s Mottl. “That’s a fact of life. If you give people choice on [how they do development], and have a successful person who gets poached by [a] competitor, you have to either hire a team to unwrap what that person built or jettison their work and rebuild it.”
By contrast, he says, “places that have structured coding and structured commits and organized constructions of software have done very well.”
But keeping all of a company’s engineers working with the same languages and on the same development paths is not easy in a field that moves as fast as machine learning. Zongjie Diao, Cisco director of product management for machine learning, quipped: “I have a data scientist friend who says the speed at which he changes girlfriends is less than [the] speed at which he changes languages.”
The data scientist/IT manager clash
Once a company finds the data engineers and scientists it needs and gets them started on applying machine learning to its operations, one of the first obstacles they face just might be the company’s IT department, the panelists suggested.
“IT is process oriented,” Mottl says. The IT team “knows how to keep data secure, to set up servers. But when you bring in a data science team, they want sandboxes, they want freedom, they want to explore and play.”
Also, Nvidia’s Johnson pointed out, “There is a language barrier.” The AI world, he says, is very different from networking or storage, and data scientists find it hard to articulate their requirements to IT.
On the ground or in the cloud?
And then there is the decision of where exactly machine learning should happen—on site, or in the cloud? At Lazard, Mottl says, the deep learning engineers do their experimentation on premises; that’s their sandbox. “But when we deploy, we deploy in the cloud,” he says.
Nvidia, Johnson says, thinks the opposite approach is better. “We see the cloud as the sandbox,” he says. “So you can run as many experiments as possible, fail fast, and learn faster.”
For Cisco’s Irving, the “where” of machine learning depends on the confidentiality of the data.
Mottl, who says rolling machine learning technology into operation can hit resistance from all across the company, had one last word of caution for those aiming to implement AI:
“Data scientists are building things that might change the ways other people in the organization work, like sales and even knowledge workers. [You need to] think about the internal stakeholders and prepare them, because the last thing you want to do is to create a valuable new thing that nobody likes and people take potshots against.”
A version of this post appears in the November 2019 print magazine as “For AI Rollouts, Hazards Reported Ahead.”