Case Studies

Future Combat Air Sensor Fusion

Situation: Leonardo wanted to explore novel approaches to sensor fusion to inform the sixth-generation Global Combat Air Programme (GCAP).

Problem: The GCAP concept calls for a connected mix of crewed and uncrewed air platforms carrying a diverse array of sensors and operating in dynamic mission profiles. Making sense of the vast quantities of data collected, and presenting pilots with actionable insight, is a significant challenge.

Solution: We explored a wide range of multi-platform, multi-sensor fusion architectures and techniques, assessing relative target classification performance. We also explored the non-functional properties of the architectures to expose the trade space across the GCAP system of systems, and the implications for future capability generation.

Explanation: Fusion can be performed on data extracted at various points in the processing pipeline: raw sensor output, intermediate features, or single-modality classifications.

Our work explored the relative classification performance of fusing information at each of these levels.
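A minimal sketch of the three fusion points for a two-sensor case is shown below. Everything here is an assumption for illustration (function names, shapes and weighting), not the architectures we assessed:

    import numpy as np

    # Hypothetical two-sensor example: fuse at three points in the pipeline.

    def data_level_fusion(raw_a, raw_b):
        # Combine raw sensor output before any processing: highest
        # fidelity, but the full raw streams must cross the datalink.
        return np.concatenate([raw_a, raw_b], axis=-1)

    def feature_level_fusion(feat_a, feat_b):
        # Each platform extracts intermediate features locally and
        # shares only the (much smaller) feature vectors.
        return np.concatenate([feat_a, feat_b], axis=-1)

    def decision_level_fusion(probs_a, probs_b, w_a=0.5):
        # Each sensor classifies independently; only single-modality
        # class probabilities are shared. Cheapest on bandwidth and
        # robust to a platform dropping out, but discards the most
        # information.
        return w_a * probs_a + (1.0 - w_a) * probs_b

The comments hint at the non-functional differences between the levels, which matter as much as raw accuracy.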

Pure performance cannot be the only consideration for deployable capability. What are the computational and datalink requirements of an architecture? How robust is it to failure or changes in platform availability? These were just some of the trade-offs we sought to highlight.

Neurosymbolic Person Re-ID

Situation: Our customer was interested in how to find, and maintain a track on, a person of interest within a crowd, using imagery from a UAV.

Problem: The dynamic, aerial nature of UAV imagery makes person re-identification (re-ID) difficult. No single computer vision model provides a robust solution to the challenge.

Result: In testing, our system significantly outperformed the component models, identifying people of interest with very few false positives.

Explanation: We quickly developed an MVP for the customer to test individual components, then worked with them in an Agile manner to focus on those that showed most promise.

We collected several bespoke datasets to train and test models against, filling gaps in the open-source datasets.

Due to the multimodal nature of the models, we can use different input triggers: a picture of the person taken that day, a text description of them, or an archive photo of their face.
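As a rough sketch of how different triggers can share one search pipeline, the snippet below uses an off-the-shelf CLIP joint embedding from Hugging Face. This is an illustrative stand-in, not our deployed models:

    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def embed_text(description):
        # Text trigger: a written description of the person.
        inputs = processor(text=[description], return_tensors="pt", padding=True)
        with torch.no_grad():
            emb = model.get_text_features(**inputs)
        return emb / emb.norm(dim=-1, keepdim=True)

    def embed_image(image):
        # Image trigger: a PIL image, e.g. a photo from that day
        # or an archive face shot.
        inputs = processor(images=image, return_tensors="pt")
        with torch.no_grad():
            emb = model.get_image_features(**inputs)
        return emb / emb.norm(dim=-1, keepdim=True)

    def rank_gallery(query_emb, gallery_embs):
        # Whatever the trigger, ranking detected person crops reduces
        # to cosine similarity in the shared embedding space.
        return (gallery_embs @ query_emb.T).squeeze(-1).argsort(descending=True)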

The system is designed to be extensible, allowing for models to be added or removed to fit the task at hand.
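One way to picture that extensibility (a hypothetical interface, not the system's actual code) is a simple registry that fuses whichever component models are currently plugged in:

    from typing import Callable, Dict
    import numpy as np

    # Hypothetical plug-in registry: each component model scores the
    # candidate detections in a frame, and models can be added or
    # removed without touching the fusion step.
    MODELS: Dict[str, Callable[[np.ndarray], np.ndarray]] = {}

    def register(name: str):
        def decorator(fn):
            MODELS[name] = fn
            return fn
        return decorator

    @register("appearance")
    def appearance_model(frame: np.ndarray) -> np.ndarray:
        return np.random.rand(4)      # placeholder: scores for 4 detections

    def fuse_scores(frame: np.ndarray) -> np.ndarray:
        # Mean over whichever models are registered for this task.
        return np.mean([m(frame) for m in MODELS.values()], axis=0)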

LLM Content Detection

Situation: Dstl sought ways to distinguish LLM-generated from human-generated content in media and social media.

Problem: The performance of modern LLMs means that detection is becoming increasingly challenging and nuanced; conventional ML approaches are struggling to perform adequately.

Solution: Integrate context through contrastive learning, using metrics extracted from the text and Siamese networks with triplet loss.

We developed two approaches using novel methods to create a tool that detects over 80% of LLM-generated content and can adapt to different domains, far exceeding existing tools such as GPTZero and LLMDet.

Explanation: We used contrastive learning, with our novel subanalytics method, to develop two models that work in concert with each other.

Our subanalytics method uses metadata, features and extracted metrics to form a profile of the text. This profile can be highly discriminative.
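The exact feature set is not shown here, so the profile below is a hypothetical stand-in built from a few simple stylometric metrics:

    import re
    import numpy as np

    # Hypothetical stand-in for a subanalytics profile: a handful of
    # simple stylometric metrics computed from the raw text.
    def text_profile(text: str) -> np.ndarray:
        words = re.findall(r"[A-Za-z']+", text.lower())
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        n_w, n_s = max(len(words), 1), max(len(sentences), 1)
        return np.array([
            len(words) / n_s,                # mean sentence length
            len(set(words)) / n_w,           # type-token ratio
            sum(map(len, words)) / n_w,      # mean word length
            text.count(",") / n_s,           # commas per sentence
        ])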

We also developed a model which uses the raw embeddings of the text.

Each model takes the context of some text and uses a Siamese network to optimise its embeddings to maximise detection capability. The two models are combined to give the user the greatest confidence in the assessment.
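In sketch form (layer sizes and the margin are assumptions), each model's Siamese encoder could be trained with a triplet loss like this:

    import torch
    import torch.nn as nn

    # Illustrative Siamese encoder with triplet loss; dim_in would be
    # the length of the subanalytics profile or of the raw text
    # embedding, depending on which model is being trained.
    class SiameseEncoder(nn.Module):
        def __init__(self, dim_in, dim_out=64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim_in, 128), nn.ReLU(),
                                     nn.Linear(128, dim_out))

        def forward(self, x):
            # Unit-normalise so distances are comparable across batches.
            return nn.functional.normalize(self.net(x), dim=-1)

    encoder = SiameseEncoder(dim_in=32)
    loss_fn = nn.TripletMarginLoss(margin=0.5)

    # Anchor and positive share a label (e.g. both LLM-generated);
    # the negative is drawn from the other class.
    anchor, positive, negative = (torch.randn(8, 32) for _ in range(3))
    loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
    loss.backward()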

Self-Supervised Sonar Classification

Situation: Navy Digital’s FASTER programme wants to accelerate submarine technology adoption.

Problem: Modern AI/ML solutions require large amounts of data. Sonar is a specialist domain, making the collection and labelling of large volumes of data difficult.

Solution: A self-supervised sonar classification capability using auto-encoders, contrastive learning and self-labelling.

We experimented with several self-supervised learning approaches, beating our supervised baseline by up to 25% with the same number of training samples.

Explanation: We worked with the customer to deliver a Docker container, enabling direct integration with their wider system.

Self-supervised learning works by solving a proxy task to learn feature representations. This task either uses no labels, or labels that are easy to generate (e.g. metadata).

The learned representations can then be used to train a supervised classifier on the actual task using far fewer labelled samples.
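A toy sketch of the two-stage recipe, using the auto-encoder variant (shapes, sizes and data are placeholders, not our delivered capability):

    import torch
    import torch.nn as nn

    # Stage 1: pretrain an auto-encoder on unlabelled sonar spectrogram
    # patches. The proxy task (reconstruction) needs no labels.
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU(),
                            nn.Linear(256, 32))
    decoder = nn.Sequential(nn.Linear(32, 256), nn.ReLU(),
                            nn.Linear(256, 64 * 64))
    opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()])

    unlabelled = torch.rand(128, 1, 64, 64)       # placeholder data
    recon = decoder(encoder(unlabelled))
    loss = nn.functional.mse_loss(recon, unlabelled.flatten(1))
    loss.backward(); opt.step()

    # Stage 2: a small labelled set trains only a lightweight classifier
    # head on top of the frozen encoder features.
    head = nn.Linear(32, 5)                        # e.g. 5 target classes
    labelled = torch.rand(16, 1, 64, 64)
    labels = torch.randint(0, 5, (16,))
    with torch.no_grad():
        feats = encoder(labelled)                  # frozen features
    clf_loss = nn.functional.cross_entropy(head(feats), labels)
    clf_loss.backward()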