Databases designed for a specific purpose often fail when asked to solve a different problem. For example, the securities finance databases of leading data providers such as FIS Astec, Datalend, and IHS Markit, designed more than 20 years ago for performance benchmarking, are inadequate when queried about the purpose of the loans themselves. Even regulatory databases, enriched with new SFTR filings to help authorities monitor leverage, are unable to determine the propriety of the loans.
None of the existing databases was intended or designed to map loans edge-to-edge, that is, from the principal lender to the principal borrower. Usually, the loan of securities is made by a pension or mutual fund through a series of financial intermediaries to the ultimate borrower, which is generally the trading desk at a hedge fund or broker-dealer. Because securities are fungible, the intermediaries' systems can pool the loans and distribute the borrowed securities through a highly efficient netting system, which breaks the chain of loans and borrows. As a result, it is extremely difficult to link the source and use of the borrowed securities.
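The effect of pooling can be illustrated with a minimal Python sketch. It assumes a single intermediary whose records keep only per-counterparty totals; the function `pooled_view`, the counterparty names, and the share quantities are hypothetical and are not drawn from any provider's data model. Two different edge-to-edge allocations produce identical pooled records, so the original lender-to-borrower links cannot be recovered from the pooled view.

```python
from collections import Counter

def pooled_view(edge_loans):
    """Collapse (lender, borrower, shares) records into the per-counterparty
    totals that a netting intermediary would retain (an assumption)."""
    inflow = Counter()   # shares supplied by each lender
    outflow = Counter()  # shares delivered to each borrower
    for lender, borrower, shares in edge_loans:
        inflow[lender] += shares
        outflow[borrower] += shares
    return dict(inflow), dict(outflow)

# Two hypothetical edge-level allocations of the same 200 shares.
scenario_a = [
    ("PensionFund", "HedgeFundX", 100),
    ("MutualFund", "HedgeFundX", 50),
    ("MutualFund", "HedgeFundY", 50),
]
scenario_b = [
    ("PensionFund", "HedgeFundX", 50),
    ("PensionFund", "HedgeFundY", 50),
    ("MutualFund", "HedgeFundX", 100),
]

# Both scenarios look the same once pooled: the edges are lost.
same_after_pooling = pooled_view(scenario_a) == pooled_view(scenario_b)
print(same_after_pooling)
```

In this sketch, both scenarios show each lender supplying 100 shares, HedgeFundX receiving 150, and HedgeFundY receiving 50, even though the underlying loans differ.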
A full mapping is needed to determine the purpose of the borrow, but that cannot be done with existing databases. The private and regulatory databases rely on loan reports that terminate at the prime broker, not the hedge fund. Without a connection to the true demand source, i.e., the trading desk, it is impossible to determine the purpose and therefore the propriety of the loan. The fluidity of the current market infrastructure also makes the relationship between securities lending activity and short selling fluctuate unpredictably, which matters especially when attempting to forecast published short interest.
IHS Markit has published a paper on the January 2021 short squeeze that candidly explains the problem as twofold. First, the ability of securities purchasers to on-lend their newly acquired positions means that more than 100% of the share float can be on loan at any one time. Second, the ability of prime brokers to use internal resources, such as hedge fund long positions, means that loans to hedge fund short sellers do not always correspond closely to their borrows from agents for the lenders.
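The first point is simple arithmetic, sketched below in Python. The sketch assumes that each borrowed block is sold short to a buyer who then re-lends a fixed fraction of it; the function name, the float size, the initial loan, the re-lending fraction, and the number of rounds are all hypothetical values chosen for illustration, not figures from the Markit paper.

```python
def shares_on_loan_after_relending(initial_loan, relend_fraction, rounds):
    """Sum the loans outstanding when each short sale's buyer re-lends
    a fixed fraction of the shares just purchased (an assumption)."""
    total_on_loan = 0.0
    loan = float(initial_loan)
    for _ in range(rounds):
        total_on_loan += loan    # this block of shares is now on loan
        loan *= relend_fraction  # the buyer lends part of it out again
    return total_on_loan

share_float = 100.0  # hypothetical float, in shares
on_loan = shares_on_loan_after_relending(initial_loan=70, relend_fraction=0.8, rounds=4)
pct_of_float = 100.0 * on_loan / share_float
print(f"shares on loan: {on_loan:.2f} ({pct_of_float:.1f}% of float)")
```

Under these assumed inputs, the same shares are counted once per loan, so the total on loan exceeds the float even though no new shares were created.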
Any suggestion of change from the prior short interest has the potential to introduce error, so a substantial recognition of changes in shares on loan should only be done when the two series are highly correlated, grading slowly toward a very limited reliance on equity finance data where there is a low expectation for forecasting success. In this view, the forecast performed as expected with the inputs available. It would have been possible for the Jan 29 short interest to print at 50m shares, which would have been interpreted as a substantial uptick in dealer inventory, likely the result of an increase in hedge fund longs (possibly also some index related Delta-1 longs). Given the events which unfolded over the last week of January, along with the decline in shares on loan, that may have been deemed unlikely, but is important not to discount as a possibility when considering the model output. 
In the Markit paper, the tracking error between the short interest published by the exchanges and the loan interest reported by agents is shown to reach 20% or more. Using existing databases for anything other than relative agent performance therefore introduces a significant error factor.
The supply of shares from beneficial owners in securities lending programs can be tracked as a real-time indication of availability from institutional owners of shares, while the gap between the exchange short interest and borrowed shares provides an indication of shares sourced by broker dealers away from the traditional securities finance channel.
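The two measures discussed above can be sketched in Python as follows. The definitions are assumptions made for illustration: the gap is taken as exchange short interest minus shares on loan, and the tracking error as the absolute difference expressed as a percentage of short interest. The function names and the input figures are hypothetical and do not come from the Markit paper.

```python
def sourcing_gap(exchange_short_interest, shares_on_loan):
    """Shares apparently sourced away from the traditional securities
    finance channel (positive when short interest exceeds reported loans)."""
    return exchange_short_interest - shares_on_loan

def tracking_error_pct(exchange_short_interest, shares_on_loan):
    """Absolute difference between the two series, as a percentage of
    exchange short interest (an assumed definition)."""
    if exchange_short_interest <= 0:
        raise ValueError("exchange_short_interest must be positive")
    return 100 * abs(shares_on_loan - exchange_short_interest) / exchange_short_interest

# Hypothetical figures for a single reporting date.
short_interest = 1_000_000
on_loan_reported = 800_000

gap = sourcing_gap(short_interest, on_loan_reported)
error = tracking_error_pct(short_interest, on_loan_reported)
print(f"gap: {gap} shares, tracking error: {error:.1f}%")
```

A positive gap in this sketch would be read as shares sourced by broker-dealers outside the agent lending channel, for example from internal inventory.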
1. Sam Pierson, "Short squeeze by the numbers," IHS Markit, 12 February 2021, at https://ihsmarkit.com/research-analysis/short-squeeze-by-the-numbers.html?