Blog Posts

Natural Language Processing:- A Beginner's Introduction

Originally written on:- 11th May, 2019.

Natural Language Processing

There are a few concepts that are absolutely essential for NLP, and only some of them are discussed in this blog post. This blog consists of a few parts:- stemming and lemmatization, TF-IDF in NLP, and cosine similarity.

Stemming

Stemming algorithms work by cutting off the end or the beginning of a word, taking into account a list of common prefixes and suffixes that can be found in an inflected word. This indiscriminate cutting succeeds on some occasions but not always, which is why the approach has limitations. Below we illustrate the method with examples in both English and Spanish.

Stemming refers to a crude heuristic process that chops off the ends of words in the hope of reducing each word to its base form correctly most of the time, and it often includes the removal of derivational affixes.

Overstemming and Understemming

However, because stem
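
The suffix-cutting idea described in the stemming excerpt can be sketched in a few lines of Python. This is only a minimal illustration, not the algorithm used in the post: the suffix list, the `naive_stem` name, and the `min_stem_len` parameter are all assumptions made for the example, and prefixes are ignored entirely.

```python
# Hypothetical suffix list, chosen only for illustration.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, trying longer suffixes first.

    The word is returned unchanged (lowercased) if no suffix matches or if
    stripping would leave fewer than `min_stem_len` characters.
    """
    word = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


print(naive_stem("jumping"))  # jump
print(naive_stem("jumped"))   # jump
print(naive_stem("cats"))     # cat
print(naive_stem("running"))  # runn  (the cut is blind to doubled consonants)
print(naive_stem("sing"))     # sing  (too short to strip "ing")
```

The `running` case shows the kind of limitation the excerpt mentions: the cut is purely mechanical, so the result is not always a real word.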

TextBrain:- Building an AI startup using Natural Language Processing

Originally written on:- 28th May, 2019.

Overview

This is the blog post for TextBrain, a tool that automatically grades and validates texts. To validate a text, it uses the Copyleaks API to check for plagiarism. It also uses a modified version of GPT-2 to estimate the likelihood that the text is real or fake. It then outputs a validation score based on these 2 scores. To grade the text, it uses a neural network model trained on the automatic essay/text grading dataset on Kaggle.

Steps in this Tutorial:-
Step 1:- Download and run GPT-2, hopefully already wrapped as a Flask app.
Step 2:- Analyse its structure.
Step 3:- Re-design and add some text and stuff.
Step 4:- Design a Login and Sign Up functionality.
Step 5:- Integrate the Copyleaks API.
Step 6:- Integrate TensorFlow.js.
Step 7:- Train and transfer a scikit-learn model on the automatic essay/text grading dataset.
Step 8:- Display scores.
Step 9:- Implement a payment Funct
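
The overview says the validation score is derived from two scores (plagiarism and real-vs-fake likelihood) but does not say how they are combined. The sketch below shows one possible way, a weighted average turned into a "higher is better" score. The formula, the function name `validation_score`, and the `plagiarism_weight` parameter are assumptions for illustration only, not TextBrain's actual method.

```python
def validation_score(plagiarism_fraction, fake_probability, plagiarism_weight=0.5):
    """Combine two penalties in [0, 1] into one score in [0, 1].

    Hypothetical formula: 1 minus a weighted average of the two inputs,
    so 1.0 means "original and likely real" and 0.0 means the opposite.
    """
    inputs = {
        "plagiarism_fraction": plagiarism_fraction,
        "fake_probability": fake_probability,
        "plagiarism_weight": plagiarism_weight,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    penalty = (
        plagiarism_weight * plagiarism_fraction
        + (1.0 - plagiarism_weight) * fake_probability
    )
    return 1.0 - penalty


# Example with made-up inputs: 50% plagiarised, 25% likely to be fake.
print(validation_score(0.5, 0.25))  # 0.625
```

A weighted average keeps the result easy to interpret and lets one signal count more than the other; other combinations (for example taking the worse of the two) would be equally plausible.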