Anatec AI blog: February 2025

Tuesday, 25 February 2025

What can Azure AI Services do for you?

AI is revolutionizing industries worldwide. From automating customer service with ChatGPT to enabling early disease detection, the possibilities are limitless. But how can businesses unlock this power without employing expensive AI specialists? That’s where Microsoft Azure AI services come in.

Business leaders are rightly focused on results, not building AI models. Azure AI services let you integrate powerful, pre-built AI capabilities into your business applications without requiring a team of specialists or a deep knowledge of machine learning.

Azure AI services offer pre-built, easy-to-deploy AI models that can be provisioned through the Azure portal, just like adding a database or virtual machine. Capabilities like image analysis and document intelligence are now available to anyone who can see an opportunity to improve a business process. Azure AI services make it much easier to turn a good idea into a better business.

Azure AI Services include Language, Speech, Vision, Content Safety, and more. For example, Azure AI Language enables sentiment analysis, key phrase extraction, and summarization—helping you better understand customer feedback and improve decision-making. Some Azure AI services can be customized, allowing you to develop innovative solutions to address specific challenges. You might want to improve customer experiences or streamline operations. And with a pay-as-you-go pricing model, building a proof-of-concept application is low-risk.

Azure also provides a comprehensive suite of tools—such as storage, app services, and containers—that support AI-powered application development. Now, instead of needing AI experts, we can use Azure software engineering skills to create powerful AI solutions.

With Azure AI, businesses of all sizes can tackle challenges and seize opportunities in ways that were once only possible for larger organizations.

If you are ready to explore the potential of AI in your business, contact us now to find out how we can help. We have the Azure software development skills needed to incorporate Azure AI services in ways that could transform your operations and drive innovation.

Monday, 17 February 2025

Retrieval Augmented Generation

I’ve got a riddle for you. When is ChatGPT not the best thing since sliced bread? Answer: When it makes things up!

What? You didn’t think that was funny? Well, if getting things wrong is a big deal in your work, you can be forgiven for not rolling on the floor laughing.

There are many industries where it’s important to get things right, such as software development, renewable energy, science, medicine, education, government, or any other knowledge-intensive field. In all these areas, people depend on having timely and correct information to do their work and advise their customers.

What are “hallucinations” in LLMs?

Large language models (LLMs), such as ChatGPT and others, have revolutionised the way we interact with information. But despite the impressive communication skills of LLMs, they do have some limitations, such as making things up, also known as hallucinating. It may not happen very often but for some businesses and organisations, it's a big deal.

Hallucinations include generating information that is plain wrong, providing information that’s not relevant, or attributing things incorrectly. But why do they happen? And are the LLMs actually making things up?

LLM hallucinations often occur because of gaps in the training data. When the question is specialised, relates to a niche area, or falls outside the time period covered by the training data, the LLM may lack the domain-specific sources and so helpfully does the best it can.

Sometimes the response is obviously wrong, but often the response is highly plausible. And if the person requesting the information doesn’t have specialised knowledge, they may well think the answer is correct. This is the point where the riddle becomes even less funny than it was to begin with. And it wasn’t very funny even then ….

How Retrieval Augmented Generation (RAG) helps

RAG is the process of optimising large language models (LLMs) for specific knowledge-intensive tasks. The “augmented” in RAG is the process of supplementing the data available to the LLM with additional, domain-specific data. This may be internal to an organisation, or specialist data for a particular sphere of knowledge. It might include searching databases and selected documents. It means that the LLM has access to more recent and more specialist data, so hallucinations become less frequent, and relevance and accuracy increase for that particular area. And by making verified company-specific data available to the LLM, there’s massive business benefit.
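To make the idea concrete, here is a minimal sketch of the retrieve-then-augment step in Python. It is an illustration under stated assumptions, not a production design: the three documents are made up, the word-overlap scoring stands in for a real search index, and the names `retrieve` and `build_prompt` are hypothetical. The resulting prompt would then be sent to whichever LLM you use.

```python
import re

# Hypothetical in-house knowledge base; in practice this would be an indexed document store.
DOCUMENTS = [
    "Our standard warranty on solar inverters is five years from installation.",
    "Support tickets are answered within one working day.",
    "Wind turbine inspections are scheduled every six months.",
]

def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question (a stand-in for real search)."""
    question_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(question_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Augment the question with retrieved context before it goes to an LLM."""
    context = "\n".join(retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = build_prompt("How long is the warranty on solar inverters?", DOCUMENTS)
print(prompt)
```

The point of the sketch is the shape of the process: find the most relevant verified content first, then ask the model to answer from that content rather than from memory alone.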

Why RAG matters to business

As our world becomes more complex, so does our need to manage information. In industries like software development, manufacturing, or renewable energy, accurate and reliable information is crucial. By integrating RAG into your business processes, you’re giving your teams better data for decision-making, customer service, and product development. RAG provides a way for organisations to improve their products and services by harnessing AI in a way that’s tailored to their needs.

Ready to leverage RAG in your business?

The first step in creating a RAG solution is identifying and collecting relevant content ready for indexing. If you’d like to discuss a proof-of-concept project for your business, we’ve got the relevant skills to tailor a solution to fit your specific needs. Get in touch to see whether our skills are a good fit for your goals.

Wednesday, 12 February 2025

Is your data AI ready?

Intelligent data: what it is, and why it matters

I know a lot of people are wary of AI. The idea that software might make better decisions than lawyers, scientists, or doctors takes a bit of getting used to. But AI is already improving the world in wonderful ways.

AI is processing images to improve scientists’ understanding of wildlife populations. Advanced research shows that AI could do a better job than some doctors of identifying eye problems. And AI is already analysing documents to provide lawyers with relevant case histories.

None of this is science fiction. These are real projects that are reducing costs, improving accuracy, and removing repetitive tasks from highly skilled people, freeing them up to do more valuable and interesting work.

Whilst we marvel at the intelligence and friendliness of ChatGPT, many business executives are trying to figure out how best to use AI to stay competitive. The irony is that the answer may be surprisingly simple.

Learn about AI: what it really is, how it works, and why it matters in your industry.
Bring data together: AI works best on large volumes of trustworthy data. That means you need to bring disparate data sources together into one unified and secure data platform.
Get your data in shape. Clean it up. Match different data sources. Remove duplicates. Have an audit trail so you can trust the output. Add metadata.
Rinse and repeat. Put processes in place so new data is added and cleaned ready for analysis.
Make it visual. Microsoft Power BI has AI capabilities that make it much easier to visualise data, and to find data insights.

AI is a disruptor, there can be little doubt about that. Exactly how it will disrupt specific industries is more difficult to predict. What we do know is that businesses that use data to uncover insights and help their people make better decisions are outperforming those that do not. Everyone has to get started by understanding what the opportunities might be, and prototyping the possibilities.

Microsoft, with its Azure intelligent data platform, is a leader in analytics and business intelligence platforms. From Power BI and Microsoft Fabric to Azure SQL Database and Azure Cosmos DB, there’s a data platform to meet data needs large and small.

If you are thinking about your next steps into an AI-powered world, get in touch to see whether our data engineering experience could help.

Thursday, 6 February 2025

Would your data win a gold medal?

Whether you work for a global enterprise or a local business, data quality is vital for analysis and reporting. I’ve talked about this before in How to Improve Data Quality and Designing for Data Protection but today I’m exploring a structured approach to data quality known as the medallion architecture.

Data can come from a variety of systems, including legacy systems that have been used for many years. Some well-designed relational database systems will have tight constraints to help improve data quality, but others may have a lot of errors. You therefore need a way of getting your data into a format that can be trusted by decision-makers.

This architecture comes from the world of big data and data lakes, but the ideas behind it are useful for all data projects. It’s called the medallion architecture, and it processes data in three distinct stages: Bronze, Silver, and Gold—just like Olympic medals. Clever, right?

Bronze layer: raw data

Data from source systems are imported “raw” without making any changes to the data. The purpose of this stage is to:

• Validate import integrity. Ensure no data are missing, the original schema has been preserved, data has not been corrupted, etc.

• Add metadata. Columns are added to identify the import date and time, originating system etc.

• Provide an audit trail. This bronze layer data is not modified, and so can be used to validate queries that emerge in later stages.

• Avoid re-importing. This initial stage might not be the most glamorous, but it provides the foundation for everything that follows and needs to be done with care. You want to avoid re-importing the data if problems emerge later.

Data are appended to the bronze layer periodically, and so files will increase in size over time. Data in the bronze layer is never accessed directly by business users, data scientists, or analysts. Instead, it forms the foundation for the silver and gold layers.
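As a rough illustration of the bronze stage, the Python sketch below appends rows from a source file without changing any values, adds two metadata columns, and checks that no rows went missing. Everything specific here is an assumption for the example: the sample CSV, the `legacy_crm` system name, the `_source_system` and `_ingested_at` column names, and the in-memory list standing in for real storage.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical export from a source system; the blank, duplicate and odd values are kept as-is.
RAW_CSV = """customer_id,amount
C001,120.50
C002,
C001,120.50
C003,-999
"""

def ingest_to_bronze(raw_text, source_system, bronze):
    """Append source rows unchanged, plus metadata columns, after checking the import."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    # Import integrity: every data line in the source must have produced one row.
    expected = len(raw_text.strip().splitlines()) - 1  # minus the header line
    if len(rows) != expected:
        raise ValueError(f"expected {expected} rows, imported {len(rows)}")
    ingested_at = datetime.now(timezone.utc).isoformat()
    for row in rows:
        # Source values stay as strings exactly as received; only metadata is added.
        bronze.append({**row, "_source_system": source_system, "_ingested_at": ingested_at})
    return len(rows)

bronze_layer = []  # stand-in for append-only storage
count = ingest_to_bronze(RAW_CSV, "legacy_crm", bronze_layer)
print(count, "rows appended;", len(bronze_layer), "rows in bronze")
```

Note that the duplicate row, the blank amount and the suspicious -999 are all kept. Fixing them is a job for the next layer, and keeping them here is what makes the bronze layer usable as an audit trail.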

Silver layer: clean data

The silver layer uses data from the bronze layer and is never created directly from source data. The purpose of the silver layer is to:

• Clean the data. Handle missing or null values, remove duplicates, deal with out-of-range values, correct data types, normalize values, and fix other data quality issues.

• Validate the data. Check no errors have been introduced by comparing and testing against the bronze layer.

• Normalize the data. Data may be split into separate tables ready for processing at the gold layer.

Data is typically not aggregated at the silver layer but if aggregation is done, at least one non-aggregated record is preserved. Data at the silver layer might be used by data scientists or analysts. Business users would normally have access to the gold layer.
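Here is a small Python sketch of what a silver-layer cleaning step might look like. It is one possible approach, not the only one: the sample rows, the rule that amounts must not be negative, and the choice to set problem rows aside with a reason (rather than repair them) are all assumptions made for the example.

```python
# Hypothetical bronze rows: raw strings plus the metadata added at import.
bronze_rows = [
    {"customer_id": "C001", "amount": "120.50", "_source_system": "legacy_crm"},
    {"customer_id": "C002", "amount": "", "_source_system": "legacy_crm"},
    {"customer_id": "C001", "amount": "120.50", "_source_system": "legacy_crm"},
    {"customer_id": "C003", "amount": "-999", "_source_system": "legacy_crm"},
    {"customer_id": "C004", "amount": "75", "_source_system": "legacy_crm"},
]

def build_silver(rows, min_amount=0.0):
    """Return cleaned, typed rows and a list of (row, reason) for everything set aside."""
    silver, rejected, seen = [], [], set()
    for row in rows:
        key = (row["customer_id"], row["amount"])
        if key in seen:
            rejected.append((row, "duplicate"))
            continue
        seen.add(key)
        if row["amount"].strip() == "":
            rejected.append((row, "missing amount"))
            continue
        try:
            amount = float(row["amount"])  # convert text to a proper numeric type
        except ValueError:
            rejected.append((row, "not a number"))
            continue
        if amount < min_amount:
            rejected.append((row, "out of range"))
            continue
        silver.append({"customer_id": row["customer_id"], "amount": amount})
    return silver, rejected

silver_rows, rejects = build_silver(bronze_rows)
# Validate against bronze: every bronze row is either kept or accounted for.
assert len(silver_rows) + len(rejects) == len(bronze_rows)
print(silver_rows)
print([reason for _, reason in rejects])
```

Recording a reason for each rejected row means the numbers can always be reconciled back to the bronze layer, which is the validation step described above.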

Gold layer: business-ready data

The gold layer is where data becomes business-reporting-ready. At this stage:

• Data is denormalized and aggregated according to business needs.

• Data models and measures are created, in line with how users want to query and analyse the data.

The gold layer is focussed on optimizing the data for business intelligence reporting. Data is presented in a format suitable for business users to work with in tools such as Power BI to create dashboards and reports.
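A gold-layer step could be sketched in Python along these lines. The sample orders, the region-and-month grouping, and the measures (`order_count`, `total_amount`, `average_order`) are hypothetical; in a real project they would come from what the business actually wants to report on.

```python
from collections import defaultdict

# Hypothetical silver rows: cleaned, typed, one record per order.
silver_orders = [
    {"region": "North", "month": "2025-01", "amount": 120.5},
    {"region": "North", "month": "2025-01", "amount": 75.0},
    {"region": "South", "month": "2025-01", "amount": 200.0},
    {"region": "North", "month": "2025-02", "amount": 50.0},
]

def build_gold(rows):
    """Aggregate orders into one reporting row per region and month."""
    totals = defaultdict(lambda: {"order_count": 0, "total_amount": 0.0})
    for row in rows:
        bucket = totals[(row["region"], row["month"])]
        bucket["order_count"] += 1
        bucket["total_amount"] += row["amount"]
    return [
        {
            "region": region,
            "month": month,
            "order_count": t["order_count"],
            "total_amount": round(t["total_amount"], 2),
            "average_order": round(t["total_amount"] / t["order_count"], 2),
        }
        for (region, month), t in sorted(totals.items())
    ]

gold_rows = build_gold(silver_orders)
for row in gold_rows:
    print(row)
```

Each output row is already shaped the way a report would use it, so a tool such as Power BI can chart it without further clean-up.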

Data-led decisions need gold-standard data

I may be overdoing the Olympic theme, but the concepts behind the medallion architecture are now considered best practice. It is built on good data practices that actually work. In summary:

1. Separate data ingestion and validation from later stages. Preserve this “raw” version of the data to create a base from which to validate future processing.

2. Manage data quality issues as an intermediate step before aggregating, modelling or creating measures.

3. Optimize for business use. Create aggregations and measures suited to business reporting requirements at the final stage.

If you are tackling a reporting project, moving data to the cloud, or need help improving your data’s reliability, get in touch for a chat. We have decades of experience in improving data quality and data manipulation. Together we could make a winning team to turn raw data into golden insights (alright, alright, enough).