Data Science / Machine Learning
PulsePoint’s award-winning platforms accelerate data and programmatic technology to deliver contextually relevant and personalized health information. We help brands and agencies better understand audience engagement and are revolutionizing health decisions through real-time data.
As a member of our Data Science Engineering team, the Machine Learning Engineer, Natural Language Processing, will focus on improving our page contextualizer technology: working with healthcare topic detection algorithms, keyword and phrase extraction, and general and aspect-based sentiment analysis.
In addition to the above, they will work with the greater Data Science / Engineering teams on:
- Improving existing or developing new traffic segmentation algorithms and estimations of bid landscapes within each segment.
- Optimizing real-time bidding strategies and auction mechanics to spend ad budgets efficiently while delivering campaign targets under various constraints.
- Supporting and enhancing the existing work on health user profiling, prediction, and targeting tools.
- Contributing to projects relating to patient/physician identity for cross-device tracking, profiling and targeting.
- Supporting the existing codebase for data integration and providing production support for our core models.
These are the things that we'll be looking for from a candidate:
- 3+ years of NLP or relevant contextualization experience.
- Advanced knowledge of Python, including NumPy and pandas.
- Ability to optimize and speed up code.
In addition to the above, we'd like you to have exposure to:
- Sorting, search trees, binary heaps, tries; time and memory complexity of algorithms.
- Probability & Statistics: Markov processes and their stationary distributions; stochastic matrices and the properties of their eigenvalues; Bayesian inference and conjugate distributions; two-sample hypothesis testing.
- Dimensionality reduction; geometry of PCA and SVD; geometry of L1 and L2 regularization (why does L1 result in feature selection?); decision trees; collaborative filtering; Thompson sampling; MCMC; boosting (biases in boosted decision trees); bagging.
- Embeddings; encoders; dropout; CNNs and RNNs; internal covariate shift.
Петр Кузин, Tech Recruiter