Full Time
USA Only (Remote)
Posted on March 4, 2024
150k/year USD

Filevine

Data Scientist

About the job:
About Filevine:

Filevine is changing the way legal work gets done for law practitioners and their clients. As the leading legal operating system, Filevine is dedicated to empowering organizations with tools to simplify and elevate complex, high-stakes legal work. Powering everything from document and case management to timekeeping, billing and business analytics, over 3,400 law firms and legal teams use Filevine daily to deliver excellence.

2023 was a groundbreaking year for Filevine, as we launched a suite of AI-powered features that are transforming the legal industry.

– LeadsAI helps law firms evaluate cases faster, analyze client sentiment, identify potential problems, and predict case success.

– DemandsAI is an AI-driven demand letter generation solution that helps law firms prepare demand letters more quickly and accurately.

– ImmigrationAI streamlines the immigration process by automating tasks, reducing errors, and ensuring accuracy.

– AI Fields is a powerful tool that can enhance legal work by minimizing manual tasks, facilitating fact-checking, and quickly answering complex queries.

With these groundbreaking AI features, Filevine is empowering law firms and legal teams to deliver excellence to their clients with unprecedented speed and efficiency.

We are seeking a highly skilled and experienced Data Scientist with expertise in Natural Language Processing (NLP) and classification model sets to join our dynamic team. As a Data Scientist, you will play a crucial role in developing and implementing advanced algorithms and models to extract insights from complex textual data and solve challenging business problems. You will have the opportunity to work on cutting-edge projects and contribute to the development of innovative solutions.

Responsibilities:
Design, develop, and implement NLP algorithms and techniques for text preprocessing, feature extraction, sentiment analysis, topic modeling, named entity recognition, document classification, and other related tasks.
Develop robust classification models and frameworks using state-of-the-art machine learning and deep learning techniques for various applications, including document categorization, text classification, sentiment analysis, and recommendation systems.
Help define workflow and data stores to get data out of unstructured stores and into usable Data Science formats.
Collaborate with cross-functional teams, including product owners, software developers, and domain experts, to understand business requirements and develop end-to-end solutions.
Perform exploratory data analysis and visualization to gain insights into textual data, identify patterns, and inform feature engineering and model development.
Evaluate and compare the performance of different models, including but not limited to NLP, generative, and classification models, and propose enhancements or modifications to improve their accuracy, efficiency, and scalability.
Stay up-to-date with the latest advancements in machine learning methodologies, techniques, and frameworks, and apply them to solve complex business problems.
Communicate findings, insights, and technical concepts effectively to both technical and non-technical stakeholders through reports, presentations, and visualizations.
Support implementation of analytics tools and methodologies within our engineering tech stack.
Requirements:
Master’s or Ph.D. degree in Computer Science, Data Science, Statistics, or a related field.
Strong background and expertise in Natural Language Processing (NLP) techniques, including text preprocessing, feature extraction, sentiment analysis, topic modeling, named entity recognition, and document classification.
Proven experience in designing and implementing classification models and algorithms, such as Naïve Bayes, Logistic Regression, Support Vector Machines (SVM), Random Forests, Gradient Boosting, and Neural Networks.
Proficiency in programming languages such as Python, Spark, or Java, and libraries/frameworks such as NLTK, spaCy, scikit-learn, TensorFlow, or PyTorch.
Experience with data manipulation, analysis, and visualization using tools such as Pandas, NumPy, and Matplotlib.
Strong understanding of statistical analysis and machine learning principles, and ability to apply them to real-world legal problems.
Solid knowledge of software development practices, version control systems, and agile methodologies.
Excellent problem-solving skills, analytical thinking, and attention to detail.
Effective communication skills and ability to collaborate in a team-oriented environment.
Proven track record of delivering high-quality results on time and effectively managing high profile projects and priorities.
Preferred Skills:
Experience with true big data (exabytes and higher) processing practices.
Knowledge of cloud computing platforms such as AWS, Azure, or GCP.
Ability to mentor and educate technical groups on Data Science deployment and best practices.
30-Day Goals: Understanding our Data and defining Standard Fields to Fuel Settlement Prediction
Conduct an in-depth analysis of the unstructured JSON data corpus in our Filevine Core dataset to understand its characteristics, key attributes, and potential challenges, and to help define a Standard Fields approach for using unstructured data.
Develop a data preprocessing pipeline to clean, normalize, and transform the JSON data into a structured format suitable for NLP and text classification tasks on specific data sets using AWS SageMaker.
60-Day Goals: Adding Value to our Data by Creating Standard Fields to Fuel Settlement Prediction
Develop and fine-tune NLP models for tasks such as named entity recognition, topic modeling, and text categorization using the unstructured data from Filevine Core for standard fields (global).
90-Day Goals: Adding Value to our Data by Creating Standard Fields to Fuel Settlement Prediction
Continue to develop and fine-tune NLP models for tasks such as named entity recognition, topic modeling, and text categorization using the unstructured data from Filevine Core for standard fields (local).
6-Month Goal: Standard Fields are being used to generate a Settlement Prediction/Amount in beta testing
Have Standard Fields (Local and Global) available for DS/Analytics use from Filevine core data.
Create a model that leverages standard fields (features) to predict settlement likelihood and settlement amount.
$150,000 – $180,000 a year

The base salary range represents the low and high end of the salary range for this position. The total compensation package for this position will be determined by each individual’s location, qualifications, education, work experience, skills and performance. We believe in the importance of pay equity – the range listed is just one component of Filevine’s total compensation package for employees. Other rewards may include commissions, stock options, a paid time off policy, as well as a comprehensive benefits package, including medical, dental, and vision.

Cool Company Benefits:

– A dynamic, rapidly growing company, focused on helping organizations thrive

– Medical, Dental, & Vision Insurance (for full-time employees)

– Competitive & Fair Pay

– Maternity & paternity leave (for full-time employees)

– Short & long-term disability

– Ergonomic and height-adjustable workstations for onsite employees

– Opportunity to learn from a dedicated leadership team

– Weekly Taco Lunches in the summer/fall/spring for onsite employees

– Centrally located open office building in Sugar House

– Flexible hybrid work schedules depending on the department with some departments offering fully remote positions in the United States (R&D)

– Top-of-the-line company swag

To apply for this job please visit jobs.lever.co.
