Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Long Sequences Transformers: a review of the SOTA

less than 1 minute read

Published: on by Achraff Adjileye

A lot of work has been done on processing long documents, lifting the limitation of BERT-like models, which can only process sequences of up to 512 tokens. This has led to the release of several variants of these models designed for long documents. The main idea behind most of them is to make the attention mechanism of the Transformer (see "Attention Is All You Need") scale linearly with the input sequence length instead of quadratically, in terms of both time and memory.
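To make the quadratic-versus-linear contrast concrete, here is a minimal sketch that counts how many query-key scores are computed under full self-attention and under a sliding-window pattern. The sliding window is only one illustrative way to obtain linear scaling, and the function names and the `window` parameter are assumptions made for this example, not taken from the post.

```python
def full_attention_pairs(seq_len: int) -> int:
    """Number of query-key scores computed by full self-attention."""
    return seq_len * seq_len


def windowed_attention_pairs(seq_len: int, window: int) -> int:
    """Number of query-key scores when each token attends only to tokens
    at most `window` positions away on either side (hypothetical pattern)."""
    total = 0
    for i in range(seq_len):
        lo = max(0, i - window)            # leftmost position token i can see
        hi = min(seq_len - 1, i + window)  # rightmost position token i can see
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    for n in (512, 4096):
        print(n, full_attention_pairs(n), windowed_attention_pairs(n, window=64))
```

With a fixed window, the second count grows at most as `seq_len * (2 * window + 1)`, so multiplying the sequence length by 8 multiplies the work by roughly 8 rather than by 64.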

6 Reasons why you should adopt MLOps?

less than 1 minute read

Published: on by Subaandh Sambharathan V K

Artificial Intelligence adoption in enterprises is growing steadily. According to a recent survey, 35% of companies reported using AI in their business, and 42% are reportedly experimenting with it. Increasing AI adoption requires maintenance and monitoring of the Machine Learning models. Machine Learning Operations (MLOps) is a set of processes that aims to track, deploy, and monitor Machine Learning models in production.

Behind the scenes of Dilitrust’s Machine Learning team

less than 1 minute read

Published: on by

Dilitrust is a SaaS solution for contract management. Since its creation, Dilitrust has been developing its own artificial intelligence. It is thanks to this AI that we can offer our service. It analyses contracts and extracts the important data they contain to facilitate your daily work.

Training state of the art french language model on legal contract

less than 1 minute read

Published: on by Ahmed Touila

The majority of natural language processing modules rely on some technique for representing text at the character, word, or sequence level (sentence, paragraph, or document). As a result, the performance of these modules is highly dependent on the quality of the embeddings they are built on.

How does image analysis work in contract management?

less than 1 minute read

Published: on by Romain Vial

In the contract analysis process developed by Dilitrust, everything often begins with image analysis. Indeed, the majority of the documents we process are scanned documents in which the text is not directly accessible. It is therefore necessary to go through an image analysis stage which aims to solve the following problems:

events

Machine Learning Applied to Legal Practice

Published: on by Romain Vial

Dilitrust is a contract analytics and management solution powered by artificial intelligence. Dilitrust helps companies manage and make the most of their contract portfolio by identifying relevant information and data to manage key contractual commitments during the whole life of the contract. Our technology rests on a combination of specifically trained Natural Language Processing (NLP) algorithms and advanced machine learning techniques.

Natural Language Processing on Text

Published: on by Alexis Agahi

In this presentation you will see the various natural language processing (NLP) mechanisms applied to text. We will survey the state of the art in understanding and extracting information, starting with simple concepts, then their major evolutions with Deep Learning, and finishing with the latest publications from late 2018 on unsupervised learning for producing language models, described by some as a revolution in the field.

What Makes A Great Software Engineer?

Published: on by Alexis Agahi

Is it your technical skills, your expertise, your social soft skills or your teamwork that make you a great software engineer? The next decade might completely change the perspective, considering all the significant progress that has been made in the machine learning field. So will the future of software engineers depend on our ability to describe a problem to a machine? Does it mean that we need to think differently about our team and expertise?

How do machines understand human language?

Published: on by Hicham El Boukkouri & Ahmed Touila

Natural Language Processing (NLP) powers many modern technologies such as chatbots and spam detection. At the core of NLP lies the need to understand text: raw bytes do not allow any real meaning to be extracted. This justifies the use of the sophisticated representations we introduce here: counting (e.g. TF-IDF), static vectors (e.g. word2vec), and neural networks (e.g. Transformers).
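As an illustration of the simplest of these representations, counting, here is a minimal TF-IDF sketch using only the Python standard library. It assumes one common weighting variant (term frequency normalised by document length, multiplied by the unsmoothed logarithm of the inverse document frequency); the `tfidf` function name and the toy documents are hypothetical and not taken from the talk.

```python
import math
from collections import Counter


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per tokenised document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Term frequency times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    toy_docs = [["contract", "law"], ["contract", "spam"]]
    for vector in tfidf(toy_docs):
        print(vector)
```

A term that occurs in every document, such as "contract" in the toy data, receives a weight of zero, while terms specific to one document are weighted up.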

The future of Artificial Intelligence is contrastive

Published: on by Ahmed Touila

Applying AI to real-world problems in a supervised fashion requires a large set of data covering all aspects of the problem we are solving. In reality, it is impossible to build such a dataset, which gives our model a limited understanding of the problem and leads to absurd predictions. The solution is Contrastive Learning, which learns similarities and differences by contrasting samples against each other, thereby limiting the need for human supervision.

Measure carbon emissions of ML projects, in python

Published: on by Amine Saboni

The environmental cost of ML has been increasing in recent years, due to its ever-increasing adoption in many software applications, as well as the growing size and complexity of models. Many parameters impact the carbon cost of a Python program, such as the hardware used and the location of its execution. In this talk, we present the internals of Code Carbon, a library developed to better estimate the impact of ML projects throughout their entire life cycle.

publications

research