
Constructing and leveraging strategic datasets
Curated datasets are often needed to support your organization's every strategic activity, from delivering differentiated production AI applications to making informed operational decisions. We can help you build curated datasets as well as the tools that allow you to employ that data to further your mission and accelerate your business. Take a look around or contact us to see how.
Recent Projects
© 2024 Numantic Solutions LLC - All Rights Reserved

Services
Numantic Solutions is a data consultancy helping organizations leverage their data assets. We can assist throughout the data lifecycle: collecting and curating data, analyzing it to gain insights, using analytical results to set strategy, and deploying machine learning models in production environments.
Numantic - A portmanteau of numeric and semantic, the name reflects our focus on delivering quantitative and programmatic solutions that help address real-world problems and capitalize on untapped opportunities. Our advanced product management and software engineering expertise provides a foundation for helping clients define objectives and deliver data solutions supporting those objectives.
Why hire us? - If you are in the early stages of building a data-science foundation, our services provide many advantages, including:
⦿ Lower costs compared to hiring full analytical and engineering teams,
⦿ Faster delivery by drawing on our previous experience in building and deploying data-science solutions and in developing data-ingestion and curation systems,
⦿ More valuable solutions through a design process that draws on our backgrounds in technical product management and research design.
If you already have an established data science environment and process, we can assist by:
⦿ Augmenting your teams with senior programming, modeling, machine-learning and AI resources,
⦿ Complementing your data-science approaches and methodologies with new perspectives from broad industry and project experience.
Clients - We work with diverse organizations of all sizes across a wide range of industries and locations. Many of our clients are in the social impact space, and we're passionate about helping them help the world.
Location - We are located in the United States just south of San Francisco in Palo Alto, California, but we work with clients all over the globe.

Core Competencies
In summary, we have extensive technical, product and research experience and can help in these areas:
Advanced Data Engineering – Data engineering, ETL processes, structured and unstructured types, storage and management, and database design and optimization
Machine Learning & AI Solutions – Intelligent model development for generative AI, predictive analytics, natural language processing, and image handling
Custom Data Visualization & Dashboards – Interactive and intuitive analytics using open-source and cloud tools
Sensor & Measurement Analytics – Processing and analysis of real-time sensor data for informed decision-making
Technical Product Management – End-to-end development and execution of product strategies leveraging data
Research & Development – Cutting-edge innovation in statistical modeling, automation, and AI-driven applications
Tools and Technologies
We've worked extensively with these tools, but we're happy to quickly learn new things:
Languages - We mostly work in Python and SQL, but we've also used other languages such as R, C++, JavaScript and Visual Basic
Libraries & Frameworks - Beautiful Soup, Django, Fake Data, Flask, Geopandas, Matplotlib, Pandas, Plotly, Puppeteer, Seaborn, Selenium
AI & Machine Learning - ChatGPT (API), Gemini (API), Gensim (Doc2Vec), Google Agent Development Kit (ADK), Google's TensorFlow, Hugging Face, LangChain, Microsoft's Platform for Situated Intelligence (PSI), PyTorch, Scikit-learn, Stable Diffusion
Databases - BigQuery, MySQL, Postgres, Snowflake, Teradata
Cloud Computing - Amazon Web Services (AWS), Google Cloud Platform (GCP)
Application Programming Interfaces (API) - CenPy, Enimoh, GitHub (API), Google Docs (API), MS Office 365 (API), PRAW (Reddit API), Tweepy (X/Twitter API), Zinc
Tools - Asana, Confluence, GitHub, Google Colab, Jira, Jupyter, Notion, PyCharm

Contact Us
Have challenging data? Do you need extra analytic expertise or just have questions for us? Please drop us a line. We're interested in learning more about your organization and the potential for data to help advance its objectives. There's no fee or commitment for an introductory, solution-consulting conversation; we'll just listen, learn and provide honest, objective thoughts on next steps. Connect with us via LinkedIn or GitHub or by submitting the form below.
Please drop us a note; we'd like to hear from you.
Stephen Godfrey

⦿ Steve Godfrey is a technical product manager and data scientist. In those roles, he helps organizations leverage data and analytics to improve their products and services and to support their missions.
He frequently writes and speaks about tools and methods to manage, analyze and model datasets, as well as about applying data analytics in real-world use cases.
His professional career has been in financial and technology services, working for investment management, banking and financial technology firms, including PayPal in data science and Wells Fargo and Bank of America in currency trading, global payments and risk-management research. In addition, his volunteer work includes evaluating statistical techniques, constructing machine-learning pipelines and software consulting for organizations such as Statistics Without Borders and the Technology Accelerator Fund.
He holds an MBA and a BS in Mathematics/Applied Economics from UCLA and completed a data-science bootcamp. Steve lives in Palo Alto, CA with his wife, has two grown children and enjoys travel, outdoor activities, woodworking and gardening.
Nathan Moroney

⦿ Nathan Moroney is an experienced research engineer. He has worked across organizations to invent, develop and ship a range of technologies. Nathan has written over 90 papers and patents with over 1,000 combined citations. He has been collecting and analyzing web-based data sets since before the term crowdsourcing was coined.
He started as a research consultant at the RIT Research Corporation and then joined Hewlett-Packard Español (Barcelona) as a writing systems engineer. Nathan then transferred to HP Labs (Palo Alto), where he contributed to imaging, data, 3D, sensor and machine learning projects. His accomplishments as a principal scientist were recognized with the awarding of Fellow status at the Society for Imaging Science and Technology.
Nathan has a BS degree from Jefferson University and a Master of Science degree from the Rochester Institute of Technology. Both of his degrees are in the field of color science, and his master's research topic was color space selection for JPEG image compression. He and his wife are Palo Alto residents with two grown children. When he cooks, it is often with one (or more) skillets, and his current deep reading project is about rivers of the world.
California Community College Policy Assistant

⦿ Overview: An AI tool providing in-depth information to stakeholders in policy decisions affecting California Community Colleges
⦿ Our contribution: The CCC Policy Assistant chatbot is an artificial intelligence (AI) agent trained on policy topics related to California's Community Colleges. The bot's target audience is stakeholders who would like to participate in community college decision making and would benefit from curated and detailed information related to community colleges. Examples include board members, administrators, staff, students, community activists and legislators. Our goal in creating it is to demonstrate how advocates can use technology to help further their missions. Some sample questions include:
- How many districts are there in the California community college system?
- What is the total enrollment of Foothill College?
- What college is designated a Center of Excellence in bioprocessing?
- How many California community colleges partner with the California Department of Corrections and Rehabilitation (CDCR) to provide in‑person courses?
- What are the responsibilities of the board members of a California community college?
⦿ Resources:
- Chatbot
- GitHub
Mental Health Web Application for Schools

⦿ Overview: A data-driven web application helping schools improve their student supports for mental health, trauma, social emotional learning and equity.
⦿ Our contribution: Using Python and cloud database tools, we helped build pipelines to collect, clean and enhance both publicly available and privately procured data. Data are stored in a data warehouse within a cloud data science environment and are available for research, analysis and pre-processing before being passed to a front-end web application for presentation to school users. In addition to data analytics, the Numantic team provided product management expertise and assistance in designing early versions of the application.
⦿ Resources:
- RSSI
Social Media Sentiment

⦿ Overview: A natural language processing (NLP) sentiment analysis tool that pulls relevant Reddit posts and X (Twitter) tweets and assesses sentiment about homelessness in the Capital Region of British Columbia. This project was a collaboration between Statistics Without Borders and the Alliance To End Homelessness in the Capital. The work is being presented at the 2024 Joint Statistical Meetings (JSM).
⦿ Our contribution: Using Python, we built a data-collection pipeline that uses application programming interfaces (APIs) from X (Twitter) and Reddit to collect tweet and post data, and machine learning models to assess their relevance and sentiment. The pipeline runs on a Google Cloud instance and results are published to a public website.
⦿ Resources:
- GitHub
OpenGazette - Patent Drawing Retrieval Tool

⦿ Overview: Thousands of new patent drawings are published each week by the US Patent Office – how can we better search this collection of images?
⦿ Our contribution: We use image vector embeddings to create a database of weekly Patent Gazette drawings. This database is then queried using text descriptions, example images or categorical prompts. The embeddings and retrieval are performed using a Python toolbox running locally to meet more stringent privacy requirements.
⦿ Resources:
- GitHub
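As an illustration of the general idea behind embedding-based retrieval, the sketch below ranks a few stored vectors by cosine similarity to a query vector. It is a minimal sketch under stated assumptions, not the OpenGazette code: the drawing ids, the three-dimensional toy vectors and the function names are all hypothetical, and a real system would obtain its embeddings from an image or text embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(query, index, k=2):
    """Return the k ids whose stored embeddings are most similar to the query."""
    scored = [(cosine_similarity(query, vec), item_id) for item_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical toy "embeddings"; real models produce hundreds of dimensions.
index = {
    "drawing_gear": [0.9, 0.1, 0.0],
    "drawing_circuit": [0.1, 0.9, 0.1],
    "drawing_bottle": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query, e.g. a text description or an example image.
query = [0.8, 0.2, 0.1]

results = top_matches(query, index, k=2)
print(results)
```

Because text descriptions and example images can both be mapped to vectors, the same ranking step can serve either kind of query in a design like this.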
Stable Diffusion Color Analysis

⦿ Overview: How many color terms does a text-to-image generative AI learn through its training process?
⦿ Our contribution: We developed batch-mode prompt rendering using Python and proposed an initial automated analysis process. The analysis uses depth contours to estimate regions of interest and quantifies color differences with respect to a million-color term human corpus. We then use these results to estimate the range of color terms available for use.
⦿ Resources:
- Conference Proceedings Paper
SWB & IMPACT INITIATIVES
⦿ Overview: What statistical methods should be employed to account for survey design effects?
⦿ Our contribution: In this blog, we discuss a recent collaboration between Statistics Without Borders (SWB) and IMPACT Initiatives addressing survey design effects in the Multi-sectoral Needs Assessments (MSNA) conducted by IMPACT in countries experiencing humanitarian challenges and crises.
⦿ Resources:
- Blog
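For readers unfamiliar with the term, a design effect measures how much a survey's design inflates the variance of an estimate relative to a simple random sample of the same size. The sketch below computes Kish's approximate design effect from unequal weights, n * sum(w^2) / (sum(w))^2, and the corresponding effective sample size. It is offered only as a general illustration: the methods used in the SWB and IMPACT collaboration are not described here, the weights are made up, the function names are hypothetical, and this approximation ignores clustering and stratification.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weighting."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size divided by the design effect."""
    return len(weights) / kish_design_effect(weights)


# Made-up survey weights for four respondents.
equal_weights = [1.0, 1.0, 1.0, 1.0]
unequal_weights = [1.0, 1.0, 2.0, 4.0]

print(kish_design_effect(equal_weights))     # equal weights give a design effect of 1.0
print(kish_design_effect(unequal_weights))   # unequal weights give a value above 1.0
print(effective_sample_size(unequal_weights))
```

A design effect above 1 means the weighted sample carries less information than its nominal size suggests, which is why standard errors need adjusting.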
AI Data Analysis
⦿ Overview: How can current AI tools be used by a data analyst?
⦿ Our contribution: In this electronic book on AI Data Analysis, we explore using Large Language Models (LLMs) to analyze datasets, with the goal of developing operating instructions or guidelines on how to use these tools effectively in data-analysis tasks.
⦿ Resources:
- Blog
Web-Scraping Pipeline
⦿ Overview: What pandemic-relief resources are available to small businesses?
⦿ Our contribution: This video and paper cover a collaboration between Statistics Without Borders (SWB) and Client to Consultant Bridge (C2CB) to develop and deploy a multistage data pipeline to automatically curate a national list of small-business aid programs during the COVID-19 pandemic shutdown period. The pipeline's results were presented on a website and helped small business users efficiently research and find relevant aid programs.
⦿ Resources:
- Chance Journal Paper
A TensorFlow Modeling Pipeline Using TensorFlow Datasets and TensorBoard
⦿ Overview: How can data scientists add rigor and structure to their modeling experiments?
⦿ Our contribution: This blog post discusses approaches and tools that can help machine learning modelers organize, evaluate and record their experiments with algorithms and configurations. The post explores these tools through a practical demonstration in a malaria image-detection use case.
⦿ Resources:
- Blog
Selected Presentations and Writings
Presentations
Writings
Privacy Policy for Numantic Solutions LLC
Effective Date: August 5th, 2024
Welcome to Numantic Solutions LLC, accessible from www.numanticsolutions.com. At Numantic Solutions LLC, the privacy of our visitors is of extreme importance to us. This Privacy Policy document outlines the types of information that are received and collected by www.numanticsolutions.com and how that information is used.
1. Information we collect
Numantic Solutions LLC does not collect any personal information from visitors of our website. Users can contact us directly via email through the links provided on our website without submitting any personal data through the website itself.
2. Log Files
Like many other websites, www.numanticsolutions.com makes use of log files. These files merely log visitors to the site - usually a standard procedure for hosting companies and a part of hosting services' analytics. The information inside the log files includes internet protocol (IP) addresses, browser type, Internet Service Provider (ISP), date/time stamp, referring/exit pages, and possibly the number of clicks. This information is used to analyze trends, administer the site, track users' movement around the site, and gather demographic information. IP addresses and other such information are not linked to any information that is personally identifiable.
3. Cookies and Web Beacons
Numantic Solutions LLC does not use cookies.
4. Third-Party Privacy Policies
Numantic Solutions LLC does not use or engage with any third-party services that collect information on our behalf, nor do we share any user information with third-party entities. This includes third-party analytics tools and third-party APIs that typically collect personal data.
5. Security
The security of your personal information is important to us, but remember that no method of transmission over the Internet, or method of electronic storage, is 100% secure. While we strive to use commercially acceptable means to protect your personal information, we cannot guarantee its absolute security as we do not collect personal information.
6. Children's Information
Numantic Solutions LLC does not knowingly collect any personally identifiable information from children under the age of 13. If you believe that your child provided this kind of information on our website, we strongly encourage you to contact us immediately, and we will use our best efforts to promptly remove such information from our records.
7. Consent
By using our website, you hereby consent to our Privacy Policy and agree to its terms.
8. Updates
This Privacy Policy may be updated from time to time in order to reflect changes to our practices or for other operational, legal, or regulatory reasons. We encourage visitors to frequently check this page for any changes. Your continued use of our website following the posting of changes to this policy will be deemed your acceptance of those changes.
Terms of Service for Numantic Solutions LLC
Effective Date: August 5th, 2024
1. Acceptance of Terms
By accessing and using the website www.numanticsolutions.com, you accept and agree to be bound by the terms and provisions of this agreement. In addition, when using this website's particular services, you shall be subject to any posted guidelines or rules applicable to such services. All such guidelines or rules are hereby incorporated by reference into the Terms of Service.
2. Service Description
Numantic Solutions LLC is a data consultancy helping organizations leverage their data assets. These services are accessible via www.numanticsolutions.com. This website may also provide links to communicate via email, but does not collect personal information directly.
3. User Responsibilities
You are responsible for maintaining the confidentiality of any login information associated with any account you use to access our services. Accordingly, you are responsible for all activities that occur under your account/s.
4. Privacy Policy
Our Privacy Policy, which sets out how we will use your information, can be found at www.numanticsolutions.com/#privacy. By using this website, you consent to the processing described therein and warrant that all data provided by you is accurate.
5. User Conduct
You agree not to use this website to send or post any material that is unlawful, harmful, threatening, abusive, harassing, defamatory, vulgar, obscene, or otherwise objectionable. This includes, but is not limited to, any material that encourages conduct that would constitute a criminal offense, give rise to civil liability, or otherwise violate any applicable local, state, national, or international law.
6. Modifications to Service
Numantic Solutions LLC reserves the right to modify or discontinue, temporarily or permanently, the service with or without notice to the user. Such changes will be posted on the website directly or provided to you via email.
7. Intellectual Property Rights
You acknowledge that all content and materials available on www.numanticsolutions.com are protected by copyrights, trademarks, service marks, patents, trade secrets, or other proprietary rights and laws. Except as expressly authorized by Numantic Solutions LLC, you agree not to sell, license, rent, modify, distribute, copy, reproduce, transmit, publicly display, publicly perform, publish, adapt, edit, or create derivative works from such materials or content.
8. Termination of Use
You agree that Numantic Solutions LLC may, in its sole discretion, terminate or suspend your access to all or part of the website with or without notice and for any reason, including, without limitation, breach of these Terms of Service.
9. Governing Law
Any disputes arising out of or related to these Terms and Conditions and/or any use by you of the Site shall be governed by the laws of the State of California, without regard to the conflicts of laws provisions therein.
10. Date of Last Update
This agreement was last updated on August 5th, 2024.