More data is not necessarily better data, whatever the source.
Claims based on lots of data can be misleading. Such data is sometimes called “big data” (data from large databases) or “real world data” (routinely collected data). Unfortunately, routinely collected data often does not include information about “confounders”.
Confounders are factors other than the treatments being compared that can affect the outcomes. For example, a study might compare people who are more active to people who are less active to find out if being more active helps people lose weight. If the people who are more active also eat less, how much people eat would be a confounder, since it can affect weight loss.
When using routinely collected data, it is only possible to control for confounders that were already known and were measured when the data was collected. So, we cannot be sure that an association between a treatment and an outcome means that the treatment caused the outcome. Unknown or unmeasured confounders might explain the association instead.
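The activity and eating example can be made concrete with a small simulation. The sketch below is only an illustration with made-up numbers, not real data: it assumes that being active has no effect at all on weight loss, that eating less leads to an average loss of 3 units, and that active people are more likely to eat less (70% versus 30%). The function names (`simulate`, `mean_difference`, `adjusted_difference`) are hypothetical and chosen for this example.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Create made-up records of (active, eats_less, weight_loss)."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        active = rng.random() < 0.5
        # Confounder: active people are assumed more likely to eat less.
        eats_less = rng.random() < (0.7 if active else 0.3)
        # Assumption for illustration: weight loss depends only on eating,
        # not on activity, plus some random variation.
        weight_loss = (3.0 if eats_less else 0.0) + rng.gauss(0, 1)
        people.append((active, eats_less, weight_loss))
    return people


def mean_difference(people):
    """Average weight loss of active people minus that of inactive people."""
    active = [loss for is_active, _, loss in people if is_active]
    inactive = [loss for is_active, _, loss in people if not is_active]
    return mean(active) - mean(inactive)


def adjusted_difference(people):
    """Compare active and inactive people within each eating group,
    then average the two differences, weighted by group size."""
    weighted_sum = 0.0
    for level in (True, False):
        stratum = [p for p in people if p[1] == level]
        weighted_sum += mean_difference(stratum) * len(stratum)
    return weighted_sum / len(people)


people = simulate()
naive = mean_difference(people)
adjusted = adjusted_difference(people)
print(f"Ignoring how much people eat: {naive:.2f}")
print(f"Comparing like with like:     {adjusted:.2f}")
```

Under these assumptions, the first comparison shows a clear difference in favour of being active (close to 3 × 0.4 = 1.2 by construction), even though activity does nothing in this made-up world. The second comparison, which takes eating into account, shows a difference close to zero. If how much people eat had not been recorded, only the first, misleading comparison would be possible.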