Powered By
Neville Li edited this page Nov 14, 2016
·
37 revisions
Want to be added to this page? Send a tweet to @scalding or open an issue.
Company | Scalding Use Case | Code |
---|---|---|
Twitter | We use Scalding often, for everything from custom ad targeting algorithms, market insight, click prediction, and traffic quality to PageRank on the Twitter graph. We hope you will use it too! | - |
Spotify | We use Scalding for almost everything including music recommendation features like Discover Weekly & Release Radar, key business metrics, analytics and content catalogue. | - |
Etsy | We're starting to use Scalding alongside the JRuby Cascading stack described here. More to come as we use it further. | - |
eBay | We use Scalding in our Search organization for ad-hoc data analysis jobs as well as more mature data pipelines that feed our production systems. | - |
Snowplow Analytics | Our data validation & enrichment process for event analytics is built on top of Scalding. | GitHub |
PredictionIO | Machine-learning algorithms built on top of Scalding. | GitHub |
Gatling | We've just rebuilt our reports generation module on top of Scalding. Handy API on top of an efficient engine. | GitHub |
SoundCloud | We use Scalding in our search and recommendations production pipelines to pre- and post-process data for various machine learning and graph-based learning algorithms. We also use Scalding for ad-hoc and regular jobs run over production logs for things like click tracking and quality evaluation of search results and recommendations. | - |
Sonar | Our platform is built on Hadoop, Scalding, Cassandra and Storm. See Sonar's job listings. | - |
BSkyB | Sky is using Scalding on Hadoop and utilizing HBase through the SpyGlass library for statistical analysis, content-related jobs and reporting. | - |
LivePerson | LivePerson's data science group is using Scalding on Hadoop to develop machine learning algorithms and big data analysis. | - |
Sharethrough | Sharethrough uses Scalding throughout our production data infrastructure. We use it for everything from advertiser reporting and ML feature engineering, to ad targeting and click forecasting. | - |
LinkedIn | Scalding is being used at LinkedIn by both the Product Data Science team and the Email Experience team. | - |
Stripe | Stripe uses Scalding for ETL and machine learning to support our analytics and fraud prevention teams. | - |
Move | Move uses Scalding on Hadoop for advanced analytics and personalization for Realtor.com and its mobile real estate apps. | - |
Tapad | Tapad uses Scalding to manage productized analytics and reporting, internal ad-hoc data mining, and to support our data science team's research and development efforts. | - |
CrowdStrike | CrowdStrike employs Scalding in our data science and data mining pipelines as part of our big data security platforms in research, development, product and customer endpoints. We have plans to open source our Scalding API (AWS, EMR) on GitHub. | - |
Tumblr | Tumblr uses Scalding as a sort of MVC framework for Hadoop. Applications include recommendations/discovery, spam detection, and general ETL. | - |
Elance | Elance uses Scalding for constructing data sets for search ranking, recommendation systems, and other modeling problems. | - |
Commonwealth Bank Of Australia | Commbank uses Scalding as a key component of its big data infrastructure, both on the ETL side and for implementing data science pipelines that build various predictive models. | GitHub |
Sabre Labs | Sabre Labs uses Scalding for ETL and ad hoc data analysis of trip information. | |
gutefrage.net | gutefrage.net uses Scalding for its Data Products and general ETL flows. | |
MediaMath | MediaMath uses Scalding to power its Data Platform, the centralized data store that powers our ad hoc analytics, client log delivery and new optimization/insight-based products. | |
The Search Party | The Search Party is using Scalding to build production machine learning libraries for clustering, recommendation and text analysis of recruitment related data. Scalding is a breath of fresh air! | |
Opower | Opower uses Scalding and KijiExpress to analyze the world's energy data and extract machine learning-based insights that power behavior change. | |
Barclays | Barclays uses Scalding for Data Warehousing, ETL and data transformation into columnar (query optimized) data formats. | |
Devsisters | Devsisters uses Scalding for game log analysis (1264) |