Senior Data Scientist
Our client is a leading US-based analytics company whose platform (evolving since 2007) offers large enterprises around the globe the services and technology to generate insights into the mindsets, attitudes and opinions of their customers and target audiences about their brands and products, as well as those of their competitors. The platform identifies social and business trends across social media and the wider internet, drawing on an extremely robust data warehouse containing more than 90 billion posts.
The Big Data, data science and data engineering technologies behind these services evolve continually, and new approaches and ideas are constantly being developed to make the services ever more powerful. To remain at the forefront of Social Media Analytics, the company has chosen Barcelona as the home of a NEW team that will build on a modern, open-source technology stack (HDFS, Elasticsearch, Spark). On top of this stack, the Data Science team will develop cutting-edge analytical algorithms that extract meaning from predominantly text data in multiple languages, and will deliver data product prototypes. The company uses everything in the data science toolkit (rule-based approaches, machine learning, data visualization) and is constantly called upon to find innovative solutions to new problems.
The ultimate goal is to teach robots to read, understand, process and summarise text as a human being would.
This is an extraordinary opportunity for Data Scientists/Engineers and Big Data Engineers to apply their knowledge and skills, make a tangible commercial impact and contribute to this cutting-edge technology.
The main responsibilities for this role include:
- Design, build and test scalable text analytics algorithms to extract insights from large amounts of social media and online conversation data (tweets, blogs, message boards, etc.)
- Provide thought leadership on the application of data science methods and principles to customer-facing projects
- Partner closely with product development and engineering to create net-new features for our analytics software product, leveraging state-of-the-art methods in Natural Language Processing and Machine Learning
- Monitor advances in the areas of NLP, machine learning and text and image analysis in order to select the best techniques for our platform.
- Participate in the wider NLP/ML community to maintain a two-way collaboration on the latest trends and techniques in the field.
The requirements for this role include:
- Educational background in Computer Science, Statistics, Math, Computational Linguistics or a related quantitative field
- Minimum Master’s degree, Ph.D. preferred
- Expertise in the use of statistical NLP methods such as probabilistic topic modeling, neural word embeddings, dependency parsing
- Solid understanding of modern Machine Learning techniques, including neural nets and ensemble methods.
- Minimum of 3 years’ experience working on software solutions to natural language processing problems
Technical skills and experience include:
- Python (gensim, pandas, scikit-learn, spaCy)
- Understanding of data cleansing, preparation, modeling (specifically with text)
- Statistical NLP methods (topic modeling, word embeddings, dependency parsing)
- Predictive modeling techniques (logistic regression, SVM, decision trees, naive Bayes)
- Dimensionality reduction and clustering techniques
- Familiarity with common data visualization platforms, for example, ggplot, matplotlib, Tableau, plotly, bokeh, etc.
- Familiarity with common Big Data technologies, especially Apache Spark (also Elasticsearch, HBase, etc.)