Tagged | big-data
-
Then and Now: The Rethinking of Time Series Data at Wayfair
(tech.wayfair.com) -
Introducing Semantic Experiences with Talk to Books and Semantris
(research.googleblog.com) -
Simon Moss on using artificial intelligence to fight financial crimes
(www.oreilly.com) -
Give Meaning to 100 billion Events a Day - The Analytics Pipeline at Teads
(highscalability.com) -
Scaling Uber’s Hadoop Distributed File System for Growth
(eng.uber.com) -
Extracting Signals From the News
(eng.datafox.com) -
A brief introduction to two data processing architectures — Lambda and Kappa for Big Data
(towardsdatascience.com) -
Search Federation Architecture at LinkedIn
(engineering.linkedin.com) -
The Evolution of Data at Reddit
(redditblog.com) -
Data Analysis with Spark
(jobs.zalando.com) -
Under the hood: Suicide prevention tools powered by AI
(code.facebook.com) -
A Cornucopia of Area Rugs: Will a Diverse Set of Choices Help Customers Find More of What They Love?
(tech.wayfair.com) -
How to hack Spark to do some data lineage
(blog.octo.com) -
Creating a musical (data) pipeline
(devblog.songkick.com) -
Mis-employing radar charts to distinguish multidimensional data
(towardsdatascience.com) -
How to add full text search to your website
(medium.com) -
Cross-Lingual End-to-End Product Search with Deep Learning
(jobs.zalando.com) -
Dynamometer: Scale Testing HDFS on Minimal Hardware with Maximum Fidelity
(engineering.linkedin.com) -
From big data to fast data
(www.oreilly.com) -
Using Synthetic Data Modeling to Enhance Machine Learning
(engineering.salesforce.com) -
Caviar’s Word2Vec Tagging For Menu Item Recommendations
(medium.com) -
Time Series Forecasting with Splunk. Part I. Intro & Kalman Filter.
(towardsdatascience.com) -
Scaling Time Series Data Storage — Part I
(medium.com) -
PageRank in Spark
(developers.soundcloud.com) -
Omphalos, Uber’s Parallel and Language-Extensible Time Series Backtesting Tool
(eng.uber.com) -
Fishing for graphs in a Hadoop data lake
(www.oreilly.com) -
Mapping Medium’s Tags
(medium.engineering) -
Faster E-commerce Search
(www.ebayinc.com) -
The frequency of tags on Stack Overflow
(towardsdatascience.com) -
Evolving search recommendations on Pinterest
(medium.com) -
The Art of Effective Visualization of Multi-dimensional Data
(towardsdatascience.com) -
Bad Design Is Bad for Your Health: Why Data Visualization Details Matter
(engineering.cerner.com) -
Big Data: Information visualization techniques
(towardsdatascience.com) -
Out of Core Genomics
(towardsdatascience.com) -
Large-Scale Health Data Analytics with OHDSI
(blog.cloudera.com) -
How machine learning will accelerate data management systems
(www.oreilly.com) -
Hadoop Delegation Tokens Explained
(blog.cloudera.com) -
DeepVariant: Highly Accurate Genomes With Deep Neural Networks
(research.googleblog.com) -
[Episode 01] Airbnb, Machine Learning & the Future of Travel
(mesosphere.com) -
Incremental Data Capture for Oracle Databases at LinkedIn: Then and Now
(engineering.linkedin.com) -
Dali Views: Functions as a Service for Big Data
(engineering.linkedin.com) -
Rebuilding the Segment Leaderboards Infrastructure — Part 3: Design of the New System
(medium.com) -
The Global Heatmap, Now 6x Hotter
(medium.com) -
Big Data Processing at Spotify: The Road to Scio (Part 1)
(labs.spotify.com) -
Airflow: The Missing Context
(hackernoon.com) -
Using Kafka Streams API for predictive budgeting
(medium.com) -
Big Dataset: All Reddit Comments – Analyzing with ClickHouse
(www.percona.com) -
One Million Tables in MySQL 8.0
(www.percona.com) -
The Search for Better Search at Reddit - Because, certainly, we’ve solved it this time
(redditblog.com) -
Exploring and Visualizing an Open Global Dataset
(research.googleblog.com) -
Steering oceans of content to the world
(code.facebook.com) -
IMDb Data in a Graph Database
(www.percona.com) -
Implementing Temporal Graphs with Apache TinkerPop and HGraphDB
(blog.cloudera.com) -
Breaking the “curse of dimensionality” in Genomics using “wide” Random Forests
(databricks.com) -
Building the Activity Graph, Part 2
(engineering.linkedin.com) -
BigDB - an ad data pipeline for LINE
(engineering.linecorp.com) -
Engineering Data Analytics with Presto and Parquet at Uber
(eng.uber.com)