Collaborative projects
Since our inception, we have been working hard on exciting research, development and innovation collaboration projects.
Similarity of textual and image luxury fashion product data
Avoir Fashion Limited will be the comparison destination website for luxury fashion. Customers can search, compare and shop the world’s greatest stores and designers in one place.
The project aims to facilitate the automation of the current manual process of matching luxury fashion product data from multiple data channels. Specific aims are to identify and implement machine learning algorithms to enable product similarity via textual information and image verification of the fashion products to determine the least expensive item from a number of different retailers.
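As a rough illustration of the textual side of this matching task, the sketch below scores product titles with bag-of-words cosine similarity and then picks the cheapest listing among those judged similar. It is a minimal example under stated assumptions, not the project's method: the listings, the field names, the `cheapest_match` helper and the 0.6 threshold are all invented for illustration, and image verification is not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a product title and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cheapest_match(query, listings, threshold=0.6):
    """Return the cheapest listing whose title is similar enough to the query.

    The threshold is an assumed value; in practice it would be tuned on
    labelled matches.
    """
    matches = [
        item for item in listings
        if cosine_similarity(query, item["title"]) >= threshold
    ]
    return min(matches, key=lambda item: item["price"]) if matches else None


# Hypothetical listings from three retailers (invented data).
listings = [
    {"retailer": "A", "title": "Black Leather Tote Bag - Large", "price": 950.0},
    {"retailer": "B", "title": "Large black leather tote bag", "price": 899.0},
    {"retailer": "C", "title": "Red silk scarf", "price": 120.0},
]

best = cheapest_match("large black leather tote bag", listings)
print(best["retailer"], best["price"])
```

Word order and punctuation differences between retailers do not affect the score here, which is why listings A and B are both treated as the same product.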
Keywords: Textual and image similarity; Interpretability; machine learning; data normalisation.
Status: Complete
Customer feedback and digital content analysis for new online women’s football magazine
EatSleep Media create high-quality videos and podcasts that deliver on the strategic aims of their clients while being tailored to customer requirements and targeted at key audiences online. One of EatSleep Media’s clients is the Football Association of Wales, which needs to communicate regularly with its audience, delivering entertaining and informative content that showcases Welsh football, from domestic to international, and from courses and conferences to events and heart-warming stories. ESM developed the FC Cymru magazine show and brand to deliver on those strategic aims, but also to give fans, volunteers, clubs and others in the Welsh football family a voice and a product that represents them.
EatSleep Media are launching a new online women’s football show called Ballers. This project aims to entertain, inform, inspire and empower young women through football content. ESM want to better understand the audience and their needs to ensure it is launched in the most impactful and successful way.
Collaborating with the DIA, the aim of the project is to develop a data-driven formula from social media analysis and content feedback, which will help EatSleep Media to engage with their core audience. This will enable them to improve market penetration and to find new business opportunities through marketplace visibility.
Keywords: Social media analysis
Status: Complete
Incorporating predictive analytics into Health and Her data collection practices
Health and Her provides trusted, qualified and expert information on female health as well as a carefully curated selection of products recommended by women who have been through the menopause. The Health and Her approach uses the biopsychosocial model, which looks at biological, psychological and social health, and the company’s in-depth research with women has led to it being ranked fifth in BusinessCloud’s Wales Tech 50 for 2020.
The aim of the collaboration with the DIA is to use predictive analytics to help Health and Her move to the next stage of their data collection journey and draw more insight from the data they gather. The information collected will help Health and Her make more accurate predictions, give their customers a more personalised service and help women prepare for a better menopause.
Keywords: Predictive analytics, Data visualisation
Status: Complete
Comparative and predictive analysis of vessel condition
With over one hundred years of heritage and an ethos of quality, innovation, and integrity, Idwal are world leaders in ship inspections and are committed to raising standards within shipping. They revolutionised the industry with the first digital inspection framework and online platform, allowing their clients to view and understand the condition and risk of their maritime investments better than ever before, with greater visualisation, clarity and decision making.
Leveraging a network of over 250 international marine surveyors, Idwal Marine are able to conduct an inspection at any port in the world and capture several hundred individual data points. These are collated, processed, and reviewed back at their Cardiff Headquarters, with a full report and conditional grade issued. The DIA will identify key characteristics that affect a vessel's condition over time and determine if these characteristics can be used to construct a comparative grade using predictive analytics.
Keywords: Predictive analytics, comparative analytics
Status: Complete
Optimisation of truck load capacity for ugly freight
The P&A Group of companies is a family-run business with a long-standing timber heritage. The Morgan family is proud to have operated thriving timber yards for over five generations.
Based in Mold (North Wales), the P&A Group of companies is committed to operating as a supplier, customer and employer of choice; being a progressive and sustainable organisation; providing a positive and safe environment where employees can prosper; and positively contributing to the communities in which they operate.
P&A Group aims to improve the load fill percentage of the outbound articulated lorry trailers that deliver goods to customers by applying optimisation techniques and constraint programming. Using these methods, we aim to generate load ordering instructions to improve delivery capacity per truck. Even a modest improvement to load capacity would create a substantial saving in operational costs and reduce the carbon footprint of the company.
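To make the idea of load fill percentage concrete, here is a minimal sketch that packs items into trailers with the first-fit decreasing heuristic and reports the fill of each trailer. This is a deliberately simplified stand-in, not the constraint programming approach named above: it treats each item as a single number (for example volume) and ignores shape, weight distribution and unloading order. The bundle sizes and the capacity of 80 are invented.

```python
def first_fit_decreasing(item_sizes, capacity):
    """Assign items to trailers, largest first, using the first trailer that fits.

    This is a heuristic: it usually gives a good packing but does not
    guarantee the minimum number of trailers.
    """
    trailers = []  # each trailer is a list of the item sizes loaded on it
    for size in sorted(item_sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for load in trailers:
            if sum(load) + size <= capacity:
                load.append(size)
                break
        else:
            # No existing trailer has room, so open a new one.
            trailers.append([size])
    return trailers


def fill_percentages(trailers, capacity):
    """Load fill percentage for each trailer."""
    return [100.0 * sum(load) / capacity for load in trailers]


# Hypothetical bundle volumes in cubic metres (invented data).
bundles = [30, 20, 45, 25, 10, 40, 15, 35]
loads = first_fit_decreasing(bundles, capacity=80)

for load, fill in zip(loads, fill_percentages(loads, 80)):
    print(load, f"{fill:.0f}%")
```

A real load plan for irregular freight would add geometric and ordering constraints, which is where a constraint programming solver becomes useful.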
Keywords: Data pre-processing, exploratory data analysis, optimisation techniques
Status: Complete
Automatic aggregation and categorisation of textual data based on topics
Simply Do Ideas is a cloud-based, digital platform enabling organisations to capture, evaluate and support challenge-led innovation. Use cases range from employee idea capture (closed innovation) through to fully managed, global innovation marketplaces (open innovation). Current enterprise software-as-a-service (SaaS) licences are with organisations including Rolls-Royce, the Intellectual Property Office and the NHS.
Our target customers are organisations looking for internal and external solutions to their toughest business challenges. To solve these problems, we deploy our powerful, global, innovation marketplace constructed of SMEs, academia, and large organisations to provide disruptive solutions.
Working with the DIA, the project aim is to develop a reliable, automated solution enabling Simply Do Ideas to aggregate and categorise their textual data using machine learning, natural language processing and topic modelling techniques. This will reduce the time and effort associated with manual categorisation approaches.
The project involves identifying and deploying suitable machine learning algorithms to enable topic modelling of the textual data.
Our data scientist, Lowri Williams, has written a blog titled 'Topic Modelling: Going Beyond Token Outputs' which stems from this collaboration.
Keywords: Machine learning; Natural Language Processing; Textual similarity; Topic modelling
Status: Complete
Building energy demand predictions using artificial intelligence time series forecasting techniques
Accurate energy-demand forecasting provides data for designing efficient decentralised energy networks, a vital element of the transition towards a more sustainable energy system and the fight against climate change.
Sustainable Energy Ltd was formed in 1998 to provide independent consultancy in the renewable and low-carbon energy sector and has been at the forefront of developing decentralised low-carbon energy networks across the UK.
Part of the work undertaken by Sustainable Energy requires running heat and electricity consumption models to predict energy demands in individual buildings. However, when assessing and designing solutions to decarbonise whole cities, these forecasts need to be accurate and able to quickly predict future trends for a range of different building types over both the long term (daily consumption data) and the short term (hourly consumption data).
This capability was not previously available to Sustainable Energy, and the development of an artificial intelligence model incorporating live monitored data, open data sets and meteorological data was identified as a way to create data sets for analysing and designing solutions for city-wide decarbonisation. Sustainable Energy provided data and knowledge and worked closely with the DIA to implement artificial intelligence into a smart energy model.
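For readers unfamiliar with time series forecasting, the sketch below shows a seasonal naive baseline, which predicts each future value as the value observed one season earlier, together with a mean absolute error check. It is only a reference point of the kind more capable models are usually compared against, not the model developed in this project. The demand figures and the season length of 4 are invented to keep the example short; hourly data with a daily cycle would use a season length of 24.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast each future step as the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand readings in kWh covering two seasons of four steps each.
history = [10, 14, 18, 12, 11, 15, 19, 13]
forecast = seasonal_naive_forecast(history, season_length=4, horizon=4)

# Hypothetical observations for the forecast period.
actual = [12, 15, 20, 12]
print(forecast, mean_absolute_error(actual, forecast))
```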
Keywords: Low carbon, decentralised energy, renewable heat, energy consumption; electricity consumption; time series forecasting; building types; meteorological data; seasonal auto regressive integrated moving average; long short-term memory (LSTM) models; random forest; generalised additive model.
Status: Complete
Improving the efficiency of in-lab testing and prototype product build quality by utilising data science
This is an exciting collaboration between the innovative Cardiff-based technology company Sure Chill and Cardiff University's Data Innovation Accelerator, which assists companies to harness cutting-edge data science techniques to improve commercial products, processes and services.
Sure Chill develops cooling technology which is used in the medical cold chain to safely deliver vaccines worldwide, without the need for costly and unreliable rechargeable batteries. Sure Chill has produced a platform cooling technology which harnesses a unique property of water to enable continuous cooling from an inconsistent power grid. Regular cooling systems need constant power to maintain the temperature, whereas Sure Chill technology can maintain a temperature between 2°C and 8°C for more than 14 days after the power stops.
Using World Health Organization statistics, it’s estimated that, in 2018, Sure Chill vaccine refrigerators in the field helped to store over 36,000,000 safe vaccinations. This success has led to multiple industry awards, most recently Barclays’ Innovation Award 2018 and a UK Department for International Trade Board of Trade Award for exporting.
Sure Chill is currently looking at adding tools to its existing predictive analysis toolbox. Use of data science models will help the company to gain confidence in the design and to drive down development costs by reducing the test duration during new product development.
The aim of this project with the DIA is to support The Sure Chill Company by improving the testing and new product development cycle via research, development and evaluation of appropriate data science methods for implementing factorial experimental design and synthetic time series generation under laboratory conditions.
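A full factorial design tests every combination of the chosen factor levels. The sketch below enumerates such a design with the standard library; the factor names and levels are hypothetical placeholders rather than Sure Chill's actual test parameters.

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts.

    `factors` maps a factor name to the list of levels to test.
    """
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]


# Hypothetical laboratory factors and levels (invented for illustration).
factors = {
    "ambient_temp_c": [10, 32, 43],
    "power_hours_per_day": [4, 8],
    "door_openings_per_day": [0, 10],
}

runs = full_factorial(factors)
print(len(runs))  # 3 * 2 * 2 combinations
for run in runs[:3]:
    print(run)
```

Because the number of runs grows multiplicatively with each added factor, fractional designs or synthetic data are often considered when laboratory time is the limiting resource.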
Keywords: Synthetic data, time series generation, factorial experimental design.
Status: Suspended
Using graph databases to build recommender engines to improve the functionality of the Stratigens product
Talent Intuition is changing the way that people decisions are made. They achieve this by helping companies shape strategy and reduce risk by making external human capital data as readily accessible as business intelligence.
In the past, staff at Talent Intuition (TI) have relied on largely manual processes to identify and recommend locations for businesses to set up new premises or operations, based, for example, on the skillsets of local candidates. This is very time consuming. The collaboration with the DIA will use (i) Natural Language Processing techniques to transform unstructured data into structured data and (ii) graph databases to build a recommendation engine that makes stronger recommendations on the best strategies and options to support a business in its strategic thinking in the short, medium and longer term.
Keywords: Natural Language Processing, Graph Databases, Recommendation Engine
Status: Ongoing
Creating a finance-related pre-training model to predict events
Founded in 2018, Talent Ticker Limited is a successful artificial intelligence (AI) recruitment market intelligence platform. Talent Ticker currently uses various data science techniques as well as a team of researchers to annotate news articles and vacancies in order to provide market intelligence for its clients interested in staffing trends.
The current focus has been on using AI to increase the amount of content it is able to deliver by automating the annotation process as much as possible, for example correctly deriving the companies involved, locations, event topic, hiring managers and so forth.
Collaborating with the DIA, the goal is to form finance-related word embeddings relating to particular events. The DIA and Talent Ticker aim to create a finance-related BERT (Bidirectional Encoder Representations from Transformers) model by using BioBERT, a pre-trained language representation model for the biomedical domain, and modifying it to fit finance terminology. Along with the labelled data, this would form a finance-related model which could ultimately be used to predict events such as funding rounds.
Keywords: Word Embeddings, Bidirectional Encoder Representations from Transformers, Predictive Analytics
Status: Complete
Accurately verifying an individual’s identity supports a company’s fraud prevention strategy while maintaining their regulatory compliance
W2 provides real-time solutions that simplify the global regulatory compliance requirements for mobile and digital transactions via a single API integration. Supporting businesses in the financial, e-commerce, betting and gambling sectors, W2 gives access to innovative solutions and products that reduce risk, combat fraud, and facilitate identity verification and digital onboarding. These solutions are customer-centric first and foremost, ensuring businesses stay compliant whilst achieving competitive advantage and higher customer retention and acquisition. Solutions include: Know Your Customer (KYC), Identity Verification (ID), Anti-Money Laundering (AML), Fraud prevention checks, Credit and Income Checks, and Know Your Business (KYB) reporting.
The project will determine the most appropriate natural language processing techniques to accurately match customer information that contains inaccuracies or variations. Model scores and thresholds will be evaluated to determine the optimal remediation recommendations to the client. To avoid the sharing of Personally Identifiable Information (PII), we will be investigating the use of differentially private synthetic data generation to generate representative data sets for scenario modelling.
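The role of scores and thresholds in fuzzy matching can be illustrated with Python's standard `difflib` module. The sketch below is an assumption-laden example rather than W2's approach: the `normalise`, `similarity` and `classify` helpers, the 0.9 and 0.7 thresholds and the sample names are all invented for illustration.

```python
from difflib import SequenceMatcher


def normalise(value):
    """Lowercase a string and collapse repeated whitespace."""
    return " ".join(value.lower().split())


def similarity(a, b):
    """Similarity score between 0.0 and 1.0 for two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def classify(score, accept=0.9, review=0.7):
    """Map a score to an outcome using two assumed thresholds."""
    if score >= accept:
        return "match"
    if score >= review:
        return "review"
    return "no match"


# Invented name pairs: (value on record, value supplied by the customer).
pairs = [
    ("Jonathan Smith", "Jonathon Smith"),
    ("Jon  Smith", "jon smith"),
    ("Alice Jones", "Robert Brown"),
]

for recorded, supplied in pairs:
    score = similarity(recorded, supplied)
    print(f"{recorded!r} vs {supplied!r}: {score:.2f} -> {classify(score)}")
```

The middle "review" band is where guided remediation would apply: records that are neither clear matches nor clear mismatches are flagged for a follow-up check rather than being accepted or rejected outright.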
Keywords: Synthetic data; Fuzzy matching; Interpretability; Performance; Guided remediation.
Status: Complete