The Brain Sentiment Indicator (BSI) measures the “mood” on more than 10000 global stocks, major sectors, currencies, commodities and crypto based on the analysis of financial news using Natural Language Processing techniques. As a trial we provide approximately five years of history for a subset of assets and, upon request, 30 days of access to the daily report covering the full universe.
Brain's proprietary Machine Learning platform is used to generate a daily stock ranking based on the predicted returns of a universe of the largest US and European stocks on five time horizons: 2, 3, 5, 10 and 21 days (other time horizons can be developed and tested upon request). The model implements techniques to reduce the well-known overfitting problem for financial data.
Brain Market Sentiment (BMS) provides a daily score on the general mood of the market by automatically clustering by topic thousands of news items from the most popular financial media. The sentiment of each topic is calculated using Brain's proprietary Natural Language Processing platform. The BMS provides an aggregate score for the news topic sentiment of the current day.
Risk ON / Risk OFF signals based on VIX statistical indicators (Dynamic Volatility Signal) and on measures of financial stress and the macroeconomic environment (Brain Dynamic Allocation Indicator). Portfolio strategies based on these signals are implemented: the strategies toggle between two dynamic portfolios, each of which is rebalanced monthly.
The Brain Language Metrics on Company Filings (BLMCF) dataset monitors several language metrics on company filings for 6000+ US stocks. Examples of such metrics are financial sentiment, the percentage of words belonging to the financial domain classified by language type (e.g. “litigious” language), readability, and similarity between documents. The analysis is available for the whole report and for specific sections (e.g. Risk Factors and Management Discussion and Analysis).
The Brain Language Metrics on Earnings Calls Transcripts (BLMECT) dataset aims to monitor various language metrics for the quarterly earnings call transcripts of 4,500+ US stocks. These metrics include financial sentiment calculated using an LLM, the percentage of words classified by language type (e.g., "uncertainty" language) within the financial domain, readability, and similarity between transcripts. The analysis is available for specific sections: Management Discussion, Analysts’ Questions, and Management Answers to Analysts’ Questions.
Brain Wikipedia Page Views monitors the number of views of Wikipedia pages as a proxy for interest in the 1000 largest US companies. The raw number of views is monitored using the “buzz” metric to assess whether a company is receiving more visits than usual on various time horizons. The goal is to provide an alternative way to measure the attention of investors toward a specific company; this is complementary to the attention metrics measured from news or other sources.
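As an illustration of the idea of "more visits than usual", the sketch below scores the latest day's page views against a trailing window. It assumes that buzz can be approximated by a z-score (deviations from the recent mean in units of the recent standard deviation); the function name `buzz_score`, the window lengths and the sample numbers are hypothetical, and this is not Brain's actual formula.

```python
from statistics import mean, stdev


def buzz_score(daily_views, window):
    """Z-score of the latest day's views against the preceding `window` days.

    Assumption for illustration only: "buzz" is treated as a plain z-score.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to estimate a spread")
    if len(daily_views) < window + 1:
        raise ValueError("need at least window + 1 observations")
    history = daily_views[-(window + 1):-1]  # trailing window, latest day excluded
    latest = daily_views[-1]
    spread = stdev(history)
    if spread == 0:
        return 0.0  # flat history: no meaningful deviation can be computed
    return (latest - mean(history)) / spread


# Hypothetical daily page views for one company; the last day shows a spike.
views = [100, 110, 90, 105, 95, 100, 180]

# Different horizons are obtained by varying the window length.
short_horizon_buzz = buzz_score(views, window=3)
long_horizon_buzz = buzz_score(views, window=6)
```

A strongly positive value on both horizons would suggest that the page is drawing unusual attention compared with its own recent history.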
The Brain News Topics Analysis dataset exploits an internal and customized large language model to monitor specific topics and their sentiment within the financial news flow for stocks. For example, an investor may want to identify all news related to the topic “innovation” for a set of companies and track their sentiment with respect to each specific topic. Similarly, another investor can be interested in tracking all news related to the topic “risks for the company” and their sentiment.
The Brain Combined Sentiment dataset provides aggregated sentiment metrics for U.S. stocks, derived from multiple textual financial sources, including news articles, earnings calls, and filings. These metrics are calculated using different Brain datasets and combined into a single file for the convenience of investors interested in stock sentiment analysis. Additionally, a combined sentiment score is provided, calculated as the average sentiment across all sources.
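The averaging step described above can be sketched as follows. The source names (`news`, `earnings_calls`, `filings`), the sample scores and the choice to skip sources with no score are assumptions made for illustration; the actual dataset fields and the handling of missing sources may differ.

```python
def combined_sentiment(scores):
    """Average the sentiment over every source that has a score.

    `scores` maps a source name to a sentiment value, or to None when that
    source has no data for the stock (skipping missing sources is an assumption).
    """
    available = [value for value in scores.values() if value is not None]
    if not available:
        return None  # nothing to combine
    return sum(available) / len(available)


# Hypothetical per-source sentiment for one stock on one day.
row = {"news": 0.2, "earnings_calls": 0.5, "filings": None}
combined = combined_sentiment(row)  # mean of the two available sources
```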
Brain Market Monitor Dashboard allows the monitoring of markets through Brain proprietary signals and a snapshot of Brain alternative datasets.
This is a useful tool that helps investors increase their awareness in the decision process by offering a view complementary to common market data.
Algorithm-based selection, from a large database of companies, of a basket of stocks whose business is related to a specific theme (e.g. "nanotechnology"). The selection is performed by analyzing public company documents using natural language processing and machine learning classification.
Brain leverages its proprietary Machine Learning infrastructure for the validation of alternative datasets. Given a new dataset, we integrate it into our existing machine learning model for stock ranking and, during this procedure, evaluate a series of validation metrics to assess whether the new data brings alpha.
With Unsupervised Machine Learning techniques our system identifies non-trivial patterns among a large number of financial and macroeconomic data to find past days which are “similar” to the current scenario. Investment models can be built by analyzing the performance of various assets on the market clusters identified by the system.
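A minimal sketch of the "similar past days" idea is shown below. It uses a simple stand-in technique, nearest neighbours by Euclidean distance on standardized features, rather than the clustering system described above; the two features (a volatility level and a daily index return) and all values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    spreads = [pstdev(column) or 1.0 for column in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, spreads)]
        for row in rows
    ]


def most_similar_days(history, today, k):
    """Return the indices of the k past days closest to `today`."""
    *past, current = standardize(history + [today])
    distances = [
        (sqrt(sum((a - b) ** 2 for a, b in zip(day, current))), index)
        for index, day in enumerate(past)
    ]
    return [index for _, index in sorted(distances)[:k]]


# Hypothetical features per day: [volatility level, daily index return in %].
history = [
    [12.0, 0.5],
    [35.0, -2.0],
    [14.0, 0.3],
    [30.0, -1.5],
]
today = [12.5, 0.45]
similar = most_similar_days(history, today, k=2)  # the two calm days, 0 and 2
```

Once the similar days are identified, one could look at how various assets performed after each of them, which is the kind of analysis the paragraph above refers to.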
Brain Cybersecurity Basket is built to select companies focused on protecting enterprises and electronic devices from unauthorized activities through specific software and other electronic means.
Brain 3D Printing Basket selects companies focused on either producing 3D printers or creating specialized components, including the modelling software for 3D printers for all applications.
Brain Cell Therapy Thematic Basket selects companies focused on the application of cell therapy, in which viable cells are injected, grafted or implanted into patients in order to produce a medicinal effect.
Brain Nanotechnology Thematic Basket selects companies focused on technologies that enable or perform manipulation of materials at a nano- or microscale for applications in electronics, energy, biomedicine, environment and others.
Brain products and solutions leverage Natural Language Processing (NLP) techniques to extract meaningful metrics such as sentiment, language complexity and topics from structured and unstructured texts. In the context of NLP we use various machine learning techniques to assess the relevance of a company document (e.g. text extracted from a website) with respect to a specific theme (e.g. “nanotechnology” or “robotics”) or to identify the relevant topics in documents.
Brain has developed a proprietary Large Language Model customized for various tasks in financial data analysis. Some examples include: accurate sentiment analysis of news in the equity, cryptocurrency, and commodity sectors; summarization of earnings calls; creation of vertical thematic baskets; topic extraction and classification. The model runs on internal Brain GPU infrastructure, ensuring efficient processing and full data privacy.
Brain has developed a set of Machine Learning and financial feature engineering tools aimed at providing inference on the markets. Our models yield statistical predictions on targets such as asset returns; using ensemble machine learning models we can calculate probabilities associated with the spectrum of predictions. These tools can be used as building blocks for investment strategies or for proprietary and third-party portfolio models.
Brain combines various clustering algorithms together with dimension reduction techniques to extract relevant features and to cluster various types of data sets, for example all company documents by topic or the past history of market days in order to extract meaningful information.
Brain has developed a proprietary backtesting and validation approach that we use to test and optimize our models, so that our results are less dependent on the specific historical trajectory markets have undergone. The method can also be used to validate or optimize third-party models.
Brain assists Investment Management firms in the development of their proprietary algorithms.
[29/05/2020 - IEX Cloud Blog] Current times are indeed turbulent again, and for better or worse, financial markets are displaying their high potential to react to these changes on a global scale. Given this new reality, when it comes to data, it's increasingly valuable to be able to identify and process new information, spot emerging relevant topics, and assess their potential impact on financial markets and the economy. To achieve this, investors are ...
[16/12/2019 - Crux Informatics Blog] Brain is a research company that develops proprietary signals and algorithms for investment strategies. Brain also supports clients in developing, optimizing and validating their own proprietary models.
The Brain platform includes Natural Language Processing (NLP) and Machine Learning (ML) infrastructures which enable clients to integrate state-of-the-art approaches into their strategies ...
[SSRN paper by Matúš Padyšák, Quantpedia] This research studies the similarity of language used in the filings using data which enables to analyze what type of language is similar. Results show that the similarity of the positive language is the most profitable option. From a practical point of view, the positive similarity effect is examined. Results show that the lowest positive similarity stocks significantly outperform the highest positive similarity stocks. The effect cannot be explained by the common asset pricing models ...
[SSRN paper by D.F. Ahelegbey, P. Cerchiello, R. Scaramozzino, Dept Economics, Pavia University]
How much the largest worldwide companies, belonging to different sectors of the economy, are suffering from the pandemic? Are economic relations among them changing? In this paper, we address such issues by analysing the top 50 S&P companies by means of market and textual data. Our work proposes a network analysis model that combines such two types of information to highlight the connections among companies ...
[SSRN paper by D. Hanicova, F. Kalús and R. Vojtko, Quantpedia] This paper analyzes the application of natural language processing (NLP) on the 10-K and the 10-Q company reports. Using the Brain Language Metrics on Company Filings (BLMCF) dataset, which monitors numerous language metrics on 10-Ks and 10-Qs company reports, we analyze various lexical metrics such as lexical richness, lexical density, and specific density. In simple words, lexical richness says how many unique words are used by the author...
[SSRN paper by R. Vojtko and D. Hanicova, Quantpedia] Various research shows that market sentiment, also called investor sentiment, plays a role in market returns. Market sentiment refers to the general mood on the financial markets and investors' overall tendency to trade. The mood on the market is divided into two main types, bullish and bearish. Naturally, rising prices indicate bullish sentiment. On the other hand, falling prices indicate bearish sentiment. This paper shows various ways to measure market sentiment ...
[SSRN paper by C. Dujava, F. Kalús and R. Vojtko, Quantpedia] Post–earnings-announcement drift (abbr. PEAD) is a well-researched phenomenon that describes the tendency for a stock’s cumulative abnormal returns to drift in the direction of an earnings surprise for some time (several weeks or even several months) following an earnings announcement. There have been many explanations for the existence of this phenomenon. One of the most widely accepted explanations for the effect is that investors ....
Interview with Brain founders on The Alternative Data Podcast by Mark Fleming-Williams.
On this show we reveal some of the latest developments in the fast-growing Alternative Data sector.
In this episode I speak to Francesco Cricchio and Matteo Campellone of BRAIN, the provider of alternative data signals. After successful careers in physics research, Francesco and Matteo decided to join forces to create products to help investors in the markets ...
[Quantconnect use case by D. Melkin based on Brain Sentiment Indicator dataset] In this use case, we monitor the news sentiment for the constituents of 25 different sector Exchange Traded Funds (ETFs). We periodically rebalance the portfolio of ETFs to maximize our exposure to the sectors with the greatest public sentiment. The results show that the strategy consistently outperforms several benchmark approaches. Sector rotation is a strategy where you move capital among a set of sectors ...
[White paper by R. Tepelyan, Bloomberg Enterprise Quants] Global equity markets entered a period of increased volatility in early 2020 due to the onset of the Covid-19 pandemic. This volatility provides a natural experiment for evaluating the investment performance of signals derived from alternative data. In this paper, we apply simple, intuitive rules to alternative data signals and generate a portfolio with superior risk and return characteristics. We start with a portfolio that uses the signals separately and demonstrate ...
[Extracting Structured Datasets from Textual Sources - Some Examples, Matteo Campellone and Francesco Cricchio, Brain] We hereby present some examples of information extraction from textual sources such as news, company regulatory filings or earnings call transcripts. For the company filings we refer to some recent literature arguing the existence of unexploited information in these documents. We present three Brain datasets that provide several measures on various textual sources ...
Matteo Campellone holds a Ph.D. in Physics and a Master in Business Administration. Matteo’s past activities included Financial Modeling and Risk Management for financial institutions as well as Corporate Risk and Value Based Management for industrial companies. As a Theoretical Physicist he worked in the field of statistical mechanics of complex systems and of non-linear stochastic equations. Amongst other results, he put forward some new solutions for the finite size corrections to a universality class of Spin Glass models, and developed an approximation method to approach some non-linear stochastic equations.
Francesco Cricchio obtained his Ph.D. in Computational Physics applied to Quantum Physics from Uppsala University in 2010. He is the author of several scientific publications on the prediction of material properties from computer simulations, with a focus on superconductors and magnetic compounds. In 2009 one of his publications was featured on the cover of Physical Review Letters. He has focused his career on solving complex computational problems in different sectors using a wide range of techniques, from density functional theory in the domain of solid state physics to the application of machine learning methods and advanced statistics in the industrial domain.
Michael Burnett has an MBA in finance and strategy from London Business School and a Bachelor of Science from the University of Southern California, which he attended on an academic scholarship. Michael’s career has spanned technology and finance, working for companies such as Apple, Cisco and Yahoo! and working in investment banking, where he closed more than 45 transactions with media and technology companies totaling more than $25 billion. Michael has been an invited guest lecturer at New York University (Stern School of Business), SDA Bocconi and Università Cattolica.
Simone Conradi obtained a Ph.D. in Theoretical Physics, focusing his research activities on Lattice Quantum Chromodynamics using methods of Computational Physics. He specialized in statistical physics and in the thermodynamics of quantum field theories applied to fundamental matter, achieving new insights into the confining properties of quarks and gluons at finite temperature and density. Moreover, he has a ten-year career in the railway industry, with a focus on the development of systems relevant to human safety and on the management of railway diagnostic data. He teaches computer science and AI in secondary schools and authored the book "Intelligenza Artificiale" (Zanichelli, 2022).
Alessandro Sellerio is a theoretical physicist, specialized in novel applications of the methods of traditional statistical mechanics. He obtained an MSc in Physics on evolutionary models for genomes, and a Ph.D. focused on glass transitions in granular media, using analytical models, simulations and experiments. He has extensive experience in the fields of statistical and condensed matter physics and complex systems. He has lectured on computational techniques such as molecular dynamics and machine learning. In the private sector he has worked in Finance, developing statistical models and machine learning systems, and as a DevOps software engineer in industrial automation and testing.
Gabriele Perugini graduated in Physics from the University of Rome “La Sapienza” in 2013, where he was awarded the “excellence class fellowship” and the alumni association's prize for the best graduate of the year. In 2017 he obtained a Ph.D. in Theoretical Physics from the same university. During his Ph.D. he worked on the statistical physics of disordered systems, non-convex optimization problems and message-passing algorithms.
Since 2018 he has been a postdoctoral researcher at Bocconi University, where he does research in machine learning and artificial neural networks. Recently he has also become interested in empirical deep learning and biologically-plausible optimization algorithms.
Giorgio Rossi is a Master's student in Physics of Complex Systems at Università di Torino. His competences range from the theoretical foundations of machine learning to its latest applications, especially to socio-economic problems. To build as complete a picture of this context as possible, he has also studied Agent-Based and Multi-Agent models. He is currently writing his master's thesis at Körber Tissue on one of the most relevant applications of Artificial Intelligence in the Industry 4.0 field: improving the performance of a tissue converting line by preventing unplanned production downtime.