Data are becoming the new raw material of business
The Economist

Data Science in 30 Minutes: Accelerating Data Science Workflows with Bartley Richardson

GPUs built on CUDA have been used for deep learning and other applications for a long time. But when you look at data scientists and the work they’re doing, CUDA doesn’t really fit well into their workflow. Today’s scientists want quick exploration, quick results and the ability to shift gears without interrupting their train of thought. They want to think at the speed of data.

In this webinar, Bartley Richardson, PhD, a former fellow of The Data Incubator and a senior data scientist at Nvidia, addresses this issue. Richardson presents Nvidia’s RAPIDS project, an open-source suite of data processing and machine learning libraries that enables GPU acceleration for data science workflows. It also delivers a 50- to 100-times improvement over traditional GPU processing while still using the same code and following the APIs that data scientists are familiar with (e.g., Pandas, SciKit).


Data Science in 30 Minutes: Understanding Principal Component Analysis Using Stack Overflow Data with Julia Silge

This FREE webinar will take place LIVE online on April 25th at 5:30 PM ET. Register below now; space is limited!

Join The Data Incubator and Julia Silge, Data Scientist at Stack Overflow, for the next installment of our free online webinar series, Data Science in 30 Minutes: Understanding Principal Component Analysis Using Stack Overflow Data.

Principal component analysis (PCA) is a powerful approach for exploring high-dimensional data but, like most machine learning algorithms, can be challenging for learners to understand. In this webinar, we will walk through a practical and interactive explanation of what PCA is and how it works. As a case study, we’ll explore a domain that many who work with data are familiar with: programming languages and technologies, as measured by traffic to Stack Overflow questions. We will explore how interactive visualization gives us insight into the complex, real-world relationships in high-dimensional datasets. We will also discuss how data is used at Stack Overflow and life as a data scientist there.
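For readers who want a feel for what PCA computes before the webinar, here is a minimal sketch in plain Python. It is not the method or code used in the talk. It centers the data, builds the sample covariance matrix, and uses power iteration to approximate the direction of greatest variance (the first principal component). The function name `first_principal_component` and the toy numbers are hypothetical and chosen only for illustration; they are not Stack Overflow data.

```python
import math


def first_principal_component(rows, iterations=200):
    """Approximate the first principal component of `rows` by power iteration.

    Hypothetical helper for illustration; returns (unit_vector, variance_along_it).
    """
    n = len(rows)
    dims = len(rows[0])

    # Step 1: center every column so that each feature has mean zero.
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]

    # Step 2: sample covariance matrix (dims x dims).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]

    # Step 3: power iteration. Repeatedly multiplying by the covariance
    # matrix pulls the vector toward the top eigenvector. This sketch assumes
    # the starting vector is not orthogonal to that eigenvector.
    v = [1.0] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Variance captured along the component: v^T * cov * v.
    variance = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)
    )
    return v, variance


# Toy, made-up measurements of two strongly correlated features.
toy_rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
component, variance = first_principal_component(toy_rows)
print(component, variance)
```

Because the two toy features rise together, the component found points roughly along the diagonal and captures almost all of the variance. In practice, a library routine based on a full eigendecomposition or SVD would normally replace this hand-rolled loop.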


Data Science in 30 Minutes: Data Privacy and Big Data Ethics with @data_nerd, Carla Gentry

This FREE webinar will take place LIVE online on March 21st at 5:30 PM ET. Register below now; space is limited!

Join The Data Incubator and Carla Gentry, data science expert and influencer, for the next installment of our free online webinar series, Data Science in 30 Minutes: Data Privacy and Big Data Ethics.

Ethics and transparency have to go hand in hand with data science and business. Remember, before you make promises you can’t keep: machine learning, AI, NLP, etc. all require good data, communication within the team creating or designing, system compatibility, solid logical programming and MATH. It’s not just a cool buzzword or something to add to your resume or website to be deemed relevant. Bias, whether implied or intentional, affects lives; knowledge of data is important now more than ever.


Data Science in 30 Minutes: Infonomics, The New Economics of Information with Gartner’s Doug Laney

This FREE webinar will take place LIVE online on February 20th at 5:30 PM ET. Register below now; space is limited!

Join The Data Incubator and Doug Laney, Senior Analyst and Advisor with Gartner’s Chief Data Officer research group, for the next installment of our free online webinar series, Data Science in 30 Minutes: Infonomics, The New Economics of Information.

Doug will share an overview of his research on information value and highlights from his new book, “Infonomics: How to Monetize, Manage, and Measure Information for Competitive Advantage.” We will explore the origins of this concept, along with why and how organizations should treat information as an actual corporate asset. We will also discuss the specifics of how data and analytics leaders such as chief data officers (CDOs), chief data scientists, enterprise architects, CIOs, and even CFOs can understand and take advantage of information’s unique economic properties to help transform their organizations. Doug will share his methods for applying asset management best practices to information and for monetizing it, including real-world examples of how companies and government agencies have monetized their (and others’!) information. We will conclude with Gartner’s information valuation models and how some organizations have identified and generated millions of dollars of value by applying them.


Data Science in 30 Minutes: Examining Machine Learning Trends with Cloudera Research Engineer, Shioulin Sam

This FREE webinar will take place LIVE online on January 23rd at 5:30 PM ET. Register below now; space is limited!

Join The Data Incubator and Shioulin Sam, Research Scientist at Cloudera Fast Forward Labs, for the next installment of our free online webinar series, Data Science in 30 Minutes: Examining Machine Learning Trends.

We will explore the latest and greatest in machine learning, including (but not limited to) semantic recommendations and multi-task learning. In regard to semantic recommendations, we will discuss how multi-modal embeddings – an emerging technique from deep learning – enable us to build a better system that actually understands content. We will also look at how multi-task learning – an approach in which models are trained to learn related tasks in parallel – is central to the notion of Software 2.0, and helps computers learn more the way we do. We will showcase both capabilities with a live demo of our prototypes.


Data Science in 30 Minutes: Uber’s Chief Scientist Explores Frontiers of Machine Learning and AI

This FREE webinar will take place LIVE online on December 19th at 5:30 PM ET. Register below now; space is limited!

Join The Data Incubator and Zoubin Ghahramani, Chief Scientist for Uber, for the December 2018 installment of our free monthly webinar series, Data Science in 30 Minutes: Uber’s Chief Scientist Explores Frontiers of Machine Learning and AI.

Zoubin will review fundamental concepts and recent advances in artificial intelligence. He will then highlight some areas of research at the frontiers, touching on topics such as deep learning, probabilistic programming, Bayesian optimisation, and AI for data science. Finally, he will describe how these areas fit into Uber’s mission.


Data Science in 30 Minutes: Using Data Science to Predict the Future with Kirk Borne

This FREE webinar will take place LIVE online on October 25th at 5:30 PM ET. Register below now; space is limited!

Join The Data Incubator and Kirk Borne, Principal Data Scientist for Booz Allen Hamilton, for the October 2018 installment of our free monthly webinar series, Data Science in 30 Minutes: Using Data Science to Predict the Future.

Predictive Analytics is currently one of the most significant and ubiquitous applications of Machine Learning in organizations. It is a major topic in business Data Strategy, Analytics Strategy, and Machine Learning Strategy discussions. This presentation will focus on new approaches to forecasting outcomes (predictive analytics) and, going even further, to optimizing outcomes (prescriptive analytics). Specifically, we will invoke some common techniques and exploit them in new ways to “see around corners” with data.


Data Science in 30 Minutes: Holden Karau – A Quick Introduction to PySpark


IBM’s Holden Karau joined The Data Incubator in June 2017 for our free online webinar series, Data Science in 30 Minutes. Sign up below for the full video!

Holden Karau presented a super fast introduction to PySpark: how to use Python and Spark together when you exceed the limitations of a single machine. Apache Spark is a fast and general engine for distributed computing and big data processing with APIs in Scala, Java, Python, and R. This tutorial briefly introduces PySpark (the Python API for Spark) with some hands-on exercises combined with a quick introduction to Spark’s core concepts. It covers the obligatory wordcount example that comes with every big-data tutorial, and discusses Spark’s unique methods for handling node failure and other relevant internals.
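As a rough illustration of the wordcount pattern mentioned above, here is a sketch in plain Python that runs on a single machine without Spark installed. It is not the code from the webinar. The three stages mirror the RDD operations commonly used for this example in PySpark (`flatMap`, `map`, and `reduceByKey`); the function name `word_count` and the lower-casing step are assumptions added for illustration.

```python
from itertools import chain


def word_count(lines):
    """Count words across an iterable of text lines (single-machine sketch)."""
    # Stage 1, analogous to RDD.flatMap: split each line into words and
    # flatten the results into one stream of words.
    words = chain.from_iterable(line.lower().split() for line in lines)

    # Stage 2, analogous to RDD.map: pair every word with the count 1.
    pairs = ((word, 1) for word in words)

    # Stage 3, analogous to RDD.reduceByKey: sum the counts for each word.
    counts = {}
    for word, n in pairs:
        counts[word] = counts.get(word, 0) + n
    return counts


print(word_count(["to be or", "not to be"]))
```

In Spark, the same three stages would be distributed across the partitions of a dataset, with the per-key summation combining partial counts from different nodes.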


Data Science in 30 Minutes: Deep Learning to Detect Fake News with Uber ATG Head of Data Science, Mike Tamir

This FREE webinar will take place LIVE online on August 21st at 5:30 PM ET. Register below now; space is limited!


Join The Data Incubator and Mike Tamir, Head of Data Science for Uber Advanced Technologies Group, for the August 2018 installment of our free monthly webinar series, Data Science in 30 Minutes: Deep Learning to Detect Fake News.

Mike will discuss how he created FakerFact.org, an Artificial Intelligence tool that enables readers to detect when an article is focused on credible information sharing vs. when the focus is on manipulation. We will explore real-world use cases for automated “Fake News” evaluation using contemporary deep learning article vectorization and tagging. We begin with the use case and an evaluation of the contexts in which various deep learning approaches to fake news evaluation are appropriate. We will discuss several methodologies for article vectorization with classification pipelines, ranging from traditional techniques to advanced deep neural network architectures. We close with a discussion on troubleshooting and performance optimization when consolidating and evaluating these various techniques on active data sets.


Data Science in 30 Minutes: The Accidental Data Scientist with Katrina Riehl, Director of Data Science for HomeAway.com

This FREE webinar will take place LIVE online on July 24th at 5:30 PM ET. Register below now; space is limited!


Join The Data Incubator and Katrina Riehl, Director of Data Science for HomeAway.com, for the July 2018 installment of our free monthly webinar series, Data Science in 30 Minutes: The Accidental Data Scientist.

Katrina will detail the journey her career has taken from researcher and software developer to Data Scientist. She will explain how her technology roles and skills have evolved as this new discipline emerged over the last decade. She started out as a young Python and Artificial Intelligence enthusiast and, after many years, finally embraced Data Science as a discipline, leading a strong and diverse Data Science team.