WHO
WHAT
HOW
WHY
Let's start with who created artificial intelligence. Alan Turing is almost universally credited as the first person to seriously ask whether a machine could think. Since his 1950 paper, researchers have built increasingly complex models, some inspired by the human brain and some not, to understand whether and how a machine could get there. Apart from a few cold "AI winters", the work has never stopped, driven by the conviction that simple silicon chips could and should be capable of solving complex problems.
Raw data, sad 1s and 0s. Although recent products present AI as capable of understanding audio, text, and video, the truth is a bit more mundane. These systems hide an automatic transformation of our inputs into arrays of numbers (tensors) that models can easily digest. In reality, little has changed since the 90s in what constitutes the inputs and outputs of Machine Learning models: they are still numerical. What has improved is how well we transform and combine them. Likewise, with rare exceptions, the models themselves have stayed much the same; what has changed is the computational power available to them.
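To make the "everything becomes numbers" point concrete, here is a minimal sketch in plain Python. The helper names (`encode_text`, `build_vocab`, `bag_of_words`) and the toy sentences are made up for illustration; real systems use far more elaborate tokenizers and tensor libraries, but the idea is the same.

```python
def encode_text(text):
    """Turn a string into a list of integers (its UTF-8 byte values)."""
    return list(text.encode("utf-8"))


def build_vocab(words):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def bag_of_words(sentence, vocab):
    """Count how often each vocabulary word occurs in the sentence."""
    vector = [0] * len(vocab)
    for word in sentence.lower().split():
        if word in vocab:
            vector[vocab[word]] += 1
    return vector


# Hypothetical toy corpus, just to show the shape of the result.
sentences = ["the cat sat", "the cat saw the dog"]
vocab = build_vocab(" ".join(sentences).split())

# One row of numbers per sentence: this is all the model ever "sees".
matrix = [bag_of_words(s, vocab) for s in sentences]

print(encode_text("AI"))  # [65, 73]
print(matrix)             # [[1, 1, 1, 0, 0], [2, 1, 0, 1, 1]]
```

Audio and video go through the same kind of conversion, only with samples and pixels instead of words.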
Okay, someone might get upset here, but how these models work is rather trivial: through mathematical formulas. That's it. No God complex, no weaving of the Fabric; they are just a set of rather complex formulas where room has been left for errors and biases, so that learning can take place. All of this is made possible by computational power and years of study, which is why developing custom models is often expensive. Then again, it's not always necessary to build your own LLM from scratch.
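As a rough illustration of "formulas that learn from their errors", the sketch below fits the simplest possible model, a straight line `y = w * x + b`, using gradient descent on the mean squared error. The data, the learning rate, and the function names are assumptions chosen for the example; production models follow the same loop with millions or billions of parameters instead of two.

```python
def predict(w, b, x):
    """The whole 'model': a single formula with two adjustable numbers."""
    return w * x + b


def train(xs, ys, lr=0.05, epochs=2000):
    """Fit w and b by repeatedly nudging them against the error gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step in the direction that reduces the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

The model starts out wrong (`w = 0`, `b = 0`), measures how wrong it is, and corrects itself a little at every step. That loop is the "learning".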
By now, the reason should be clear: analyzing large volumes of data is hard, and machines are generally better at it than we are, especially when it comes to classifying items or clustering them based on thousands of features. The ultimate goal remains the same: transforming a large amount of raw data into value. Sometimes it's enough to extract insights from a handful of CSV files; other times it means building complex data ingestion pipelines, an entire ETL process, and interactive dashboards. Still other times, the value is a chat capable of creating kitten images. We handle all of this: we take messy data, apply a lot of mathematics, and transform it into value for you.