Archives for : April2014

Why building a data science team is deceptively hard

More and more startups are looking to hire data scientists who can work autonomously to derive valuable insights from data. In principle, this sounds great: engineers and designers build the product, while data scientists crunch the numbers to gain insights. In practice, finding these data scientists and enabling them to be productive are very challenging tasks.

Before diving further, it’s useful to note a few trends in data and product development that have emerged over the past decade:
Companies such as Google, Amazon and Netflix have shown that proper storage and analysis of data can lead to tremendous competitive advantages.

It’s now feasible for startups to instrument and collect vast amounts of usage data. Mobile is ubiquitous, and apps are constantly emitting data. Big data infrastructure has matured, which means large-scale data storage and analysis are affordable.

The widely adopted lean startup philosophy has shifted product development to be much more data-driven. Startups now face the challenges of defining Key Performance Indicators (KPI), designing and implementing A/B tests, understanding growth and engagement funnel conversion, building machine learning models, etc.

Because of these trends, startups are eager to develop in-house data science capabilities. Unfortunately many of them have the wrong ideas about how to build such a team. Let me describe three popular misconceptions.

Misconception #1: It’s okay to compromise the engineering bar for statistical skills.

For data scientists to work productively and independently, they must be able to navigate the entire technical stack and work effectively with existing systems to extract relevant data. The only exception is if a startup has already built out its data infrastructure. But in reality, very few startups have their infrastructure in place before building a data science team. In these cases, a data science team without strong engineering skills or engineering support will have a hard time doing their job. At best, they will produce suboptimal solution that will be rewritten by another team for production.

To illustrate this, take the example of building the KPI dashboard at Codecademy. Before visualizing the data in d3, I had to extract and join (a) user data from MongoDB, (b) cohort data from Redis, and (c) pageview data from Google Analytics. The data collection alone would’ve been near impossible without an engineering background, let alone making the dashboard real-time, modularized and reusable.

Misconception #2: It’s okay to compromise the statistics bar for engineering skills.

Proper interpretation of data is not easy, and misinterpreted data can do more damage than data that’s not interpreted at all (check out “Statistics done wrong”). Building useful machine learning (ML) models is trickier than most people expect. A popular but misguided view holds that ML problems can be solved either by applying some black box algorithm (a.k.a magic), or by hiring interns who are PhD students. In practice, hundreds of decisions and tradeoffs are made in solving such problems, and knowing which decisions to make requires a lot of experience. (I’ll expand upon this more in a future post titled “Machine learning done wrong”.) For a given ML problem, there are tens if not hundreds of way to solve it. Each solution makes different assumptions and it’s not obvious how to navigate and identify which assumptions are reasonable and which model should be used. Some would argue: why not just try all different approaches, cross validate, and see which one works the best? In reality, you never have the bandwidth to do so. A strong data scientist might be able to produce a sensible model right off the bat, while a weak one might spend months optimizing the wrong model without knowing where the problem is.

Misconception #3: It’s okay to hire data scientists who lack product thinking.

Imagine asking someone who doesn’t have a holistic view of the product to optimize your business KPIs. They may prematurely optimize the sign up funnel before making sure the product has reasonable retention, which would lead to more unretained users. Some think data-driven product development is a local optimization. This criticism is only correct when those who drive product development with data fail to think about the product holistically.

To sum up, a productive data science team requires data scientists that are strong in engineering, statistics, and product thinking. It’s hard. And it becomes even harder to look for the first data science hire who will be spearheading data efforts in a startup. For startups that don’t have the luxury to wait and hire these rare data scientists, it’s important to be aware of the compromises made especially in terms of the hiring bar. Before the data team is strong enough across all three areas, make sure they have strong support for the skills they lack, and don’t expect them to work autonomously.

If you like the post, you can follow me (@chengtao_chu) on Twitter or subscribe to “ML in the Valley”. Also, special thanks Ian Wong (@ihat), Leo Polovets, and Bob Ren (@bobrenjc93) for reading a draft of this.

Codecademy Reimagined

We’ve been working hard over the past four months trying to reimagine Codecademy and we couldn’t be happier to finally unveil it to the world. We have redefined every component under our brand, from a single button on our dashboard to our email template, business cards, slides and even apparel.

We had been discussing a design refresh for a while, but somehow it always ended up being pushed to the side. Finally, in October last year, after completing a user segmentation project that brought to live the main user archetypes of Codecademy.com, it quickly became apparent that if we wanted to grow and mature as a brand, we required a thorough redesign of our entire product.


Why a redesign?

Reason #1 – Start fresh

First, there was the obvious problem of design incoherence and variation. This happened primarily because we lacked a well-defined color and font palette, a uniform visual language for our badges, a unified layout scheme (page types), and a cohesive strategy for all print materials – business cards, postcards, posters, etc. After two and a half years of multiple nip and tuck design fixes and additions, it was time to clean up the house and start fresh. This meant we could finally create an extensible UI pattern library (used and shared by designers and developers) and optimize our new face across multiple platforms by embracing a responsive design layout.

Some of our old webpages
A random sampling of pages within our old web ecosystem, showcasing some visual design inconsistency.

Reason #2 – Brand matureness

Secondly, was the realization that our young startup look and feel was slowly becoming incompatible with our future goals and aspirations. In a time when we are engaged in several partnerships with schools, companies, and governments across the globe, while also continuing to fulfill the needs of our growing user base, our brand should feel a bit more mature, inviting, professional, and sophisticated.

Our old logo
Codecademy’s quirky and undistinguished old logo was created by one of our co-founders in a few minutes by browsing through various fonts in a word processor. The logo featured the giddy lobster font, which has become so popular that is at times compared with Microsoft’s Comic Sans.


Our new look

Phase 1

Back in early November last year, we partnered with our friend Eddie Opara, and his immensely talented team of designers at Pentagram, in order to create a new visual identity that could better reflect the company’s age, ambition, and main attributes.

The first thing we tackled was the logo, as the key centerpiece of our new look. We spent some time talking to our users, colleagues, and our founders Zach and Ryan, to have a solid grasp of Codecademy’s perception and future aspirations. After this important research period, we went through several revisions, continuously narrowing down on the mark that best represented our main traits.

While putting the finishing touches to our new logo, we began creating a complementary color, font and iconography palette. It was important to handle all these components simultaneously, so we could delineate a consistent design thread through all of them. Phase 1 gave us the most critical building blocks of our new brand, through our partnership with Pentagram, and marked the beginning of an exclusive in-house development period.

Early directions of our logo
Various early directions for our new logo.

narrowing down on the logo
Narrowing down on a few favorite visual marks.

our final logo
Our final logo with its underlying grid.

our new graphical language
Our new graphical language used across the site to indicate different types of content (symbols), actions and controls (icons), and learning achievements (badges).

our new font
FF DIN Rounded – Our primary typeface.

our new color palette
Our new color palette.

Phase 2

After defining the main brand pieces with Pentagram (logo, typography, iconography, color), we started applying it internally to our entire web ecosystem by building a comprehensive number of reusable design patterns. For two weeks we built a sizable UI toolkit covering a variety of elements (see below).

our first UI toolkit
Our first attempt at the UI toolkit encompassing only a short amount of elements.

our final toolkit
Our growing UI toolkit covering every element, such as header, footer, form fields, button styles, sign up modules, grid, padding, typography, colors, and interactions.

Phase 3

This was the longest, and perhaps most exhausting, of all phases, where we redesigned 70+ webpages in tandem with other collateral material (email templates, slides, apparel, etc). Fortunately, we imposed ourselves a very well defined timeline, with multiple cycles and milestones, which helped us guide through this large task (see sitemap below).

sitemap
First, we created a comprehensive sitemap of Codecademy.com and then divided the sitemap into four groups, each representing a 2-week delivery cycle. As we redesigned the various pages in each cycle, our brilliant team of developers built our UI styleguide and constructed many of the pages based on the shared design patterns.

examples 1
Examples of our redesigned pages, from left to right: Enterprise, Stories, Profile.

examples 2
Examples of our redesigned pages, from left to right: Blog, About, Jobs.

examples 3
Examples of our redesigned pages, from left to right: Help Center, UK Curriculum, Hour of Code.

Phase 4

Our final phase was all about making sure we were building the thing right. We implemented and tested our new redesign, while in the process getting feedback from our community. We created a huge amount of redlines for all the new material, started experimenting with some versions live on the site, and listened to dozens of comments from our selected users and moderators.

redlines
An example of the various comps created to support the accurate implementation of all our redesigned pages.


The beginning

Even though we spent a long time rethinking Codecademy, this hefty work is still unfinished. It certainly provides our team and product with a much-needed fresh face, one that we can feel proud of, and most importantly, one that our users can thoroughly enjoy. But this is just the beginning. We would love to hear what you have to say about our redesign and how we can continue improving our product. We have dozens of ideas to continue pushing this brand foreword. Please keep coming back for more!

We’re Learning Too: A New Codecademy

Two years ago, we started building a product that would help teach people the skills they needed to succeed in a digital world. As more than 24 million people took Codecademy courses on our web and iOS platforms, we too learned and grew. Now, we’re excited to show you our latest project — a new Codecademy designed from the ground up, aimed to help you learn skills hands-on, with real projects, and constant feedback. Better yet, the new Codecademy experience helps to connect you with the real skills you’ll need to succeed in today’s workplace.

Logo

Learning isn’t just about one exercise or “class,” but instead is a gateway to community, opportunities, ideas, and a better life. We’ve witnessed this through the millions of learners on Codecademy and through the thousands of inspiring teachers who have shared their knowledge with the world with our course creator. We listened to them while building what we think is the best learning experience — for anyone, anywhere — to learn the most important skills of today.

In two years, Codecademy has scaled to become larger than we had ever imagined. Our learners, spread across the globe in every country in the world, have:

Today, we’re proud to show off the results of all of that to a few friends and, within days, the rest of the world. The first fruits of this effort are an experience that gets you from knowing nothing to building a website — in this case, Airbnb’s homepage. Along the way, you’ll experiment with blocks of code, see the results of adding and subtracting different parts of a page, and use the real terminology that developers and designers all over the world over use to create websites just like Airbnb’s.

Build a professional website

Our new platform leaves you not just with new knowledge, but with a portfolio of projects you can share with your friends, enabling them to learn from you. We’ve even built the capability for you to share your work with future employers, and to demonstrate your new skills. We’ve been testing our new learning interface for weeks and we’ve seen it applied in an amazing number of ways — from designers at major firms winning new consulting work because of their ability to build their designs to students in high school making personal webpages for themselves.

Learning environment

Codecademy’s learning experience comes not just from the data behind 24 million learners and billions of lines of code, but also from the individual stories we’ve heard from our wealth of committed learners. Former book critic Juliet Waters, for instance, started learning with her 11-year old son as part of our Code Year program in 2012. Since then, she’s gone on to chronicle her journey in a book that’s coming soon, noting that programming helped her feel “more connected with others in our tech-driven society.” A parent named Shari told us that her 11 and 13-year old sons had a “reaction to what they are learning [that] beats their enthusiasm for the .” We work hard everyday to deliver a similar experience for our users all around the world — with more than 65% of users outside the US, it’s important to us that we’re democratizing access to the fundamental blocks of knowledge that can improve peoples’ lives.

Panel

Tommy Nicholas’ story is just the sort we’re hoping our new learning environment will foster: he began with almost no programming knowledge at all, and gained enough skills to develop a website, Coffitivity, that was named one of TIME Magazine’s Top 50 websites in 2013.

Billions of lines of code, millions of users, and years after our founding, we’ve been astonished by what people can do when they can easily learn the fundamental skills that can transform their lives. Today, we’ve redesigned Codecademy to reflect that potential — and hopefully to help more people reach their goals and build the future they want to live in.

If you want to help us, we’re hiring!