Natural Language Processing

The purpose of natural language processing is to create a computer system that analyzes, understands, and generates human language.
To do this, three technologies are required: one to input human language, one to interpret the input, and one to apply the parsed result.
The applications of this technology are wide-ranging and include automatic translation, automatic syntax generation, interaction with machines, database queries, and search.
The analysis is divided into two parts: syntactic analysis comes first, and semantic analysis comes second.
An analysis technique appropriate to the context is also required.

  • Image Processing

    The first step of natural language processing is accepting the user's input.
    In addition to keyboard input, input from microphones, scanners, cameras, and video is also needed.
    To handle video input, image processing must happen first; it is the initial operation for character recognition.

  • Character recognition

    Character recognition has three steps.
    The first step is preprocessing for OCR (optical character recognition). This covers techniques for extracting the text area, adjusting contrast, and recognizing words.
    The second step is the OCR itself, which converts a selected area into text.
    The third step is post-processing of the OCR results. This includes error checking, error correction, and text conversion.
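
    As an illustration of the third step, the sketch below corrects OCR errors by replacing each unknown token with its closest dictionary word. The vocabulary, the function names, and the 0.8 similarity cutoff are assumptions made for this example; a real system would use a full lexicon and often a language model.

```python
import difflib

# Hypothetical vocabulary; a real system would load a full lexicon.
VOCABULARY = ["natural", "language", "processing", "character", "recognition"]

def correct_token(token, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest vocabulary word, or the token unchanged if none is close enough."""
    lowered = token.lower()
    if lowered in vocabulary:
        return token
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token

def correct_text(text):
    """Correct a whitespace-separated OCR result token by token."""
    return " ".join(correct_token(t) for t in text.split())

# "1" misread for "l" and "v" misread for "u" are typical OCR confusions.
print(correct_text("natura1 langvage processing"))
```

    Tokens with no sufficiently similar dictionary entry are left untouched, so the sketch never invents a correction it cannot justify.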

  • Sectional Analysis
    1. Syntactic Analysis
      A natural sentence usually contains multiple words. The words are linked to one another according to their roles in the sentence.
      Therefore, to process natural language, you need to analyze each member of the sentence and the relationships between the members.
      This process is called the structural analysis phase.
      In this process we break the sentence down into its smallest meaningful chunks (morphemes) and analyze the role of each chunk.
    2. Steps of Syntactic Analysis

      - Sentence splitting
      This is the process of splitting paragraphs and documents into sentences.
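
      A minimal sketch of this step, assuming English text in which a sentence ends with '.', '!' or '?' followed by whitespace. The function name is illustrative, and the rule is deliberately naive.

```python
import re

def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

text = "I bought a hammer. Where are the nails? They are in the box!"
for sentence in split_sentences(text):
    print(sentence)
```

      A rule this simple also splits after abbreviations such as "Dr.", which is one reason practical splitters add abbreviation lists or statistical models.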

      - Morpheme Analysis
      The smallest meaningful chunk of a sentence is called a morpheme. This process splits a sentence into morphemes.
      Morpheme analysis uses a morpheme dictionary, morpheme grammars, and statistical data.
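
      The dictionary-based part of this step can be sketched as a greedy longest-match segmentation. The toy dictionary and the function name below are hypothetical, and the sketch uses neither morpheme grammars nor statistical data.

```python
# Hypothetical toy morpheme dictionary: morpheme -> category.
MORPHEMES = {
    "un": "prefix", "break": "stem", "able": "suffix",
    "walk": "stem", "ed": "suffix", "s": "suffix", "nail": "stem",
}

def segment(word, dictionary=MORPHEMES):
    """Greedy longest-match segmentation.

    Returns a list of (morpheme, category) pairs, or None when the
    word cannot be fully covered by dictionary entries.
    """
    result = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in dictionary:
                result.append((piece, dictionary[piece]))
                i = j
                break
        else:
            return None  # no dictionary entry starts at position i
    return result

print(segment("unbreakable"))
print(segment("walked"))
```

      Greedy matching can fail on words that a backtracking search would segment correctly, which is where grammar rules and statistics become useful.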

      - Syntax analysis
      Each morpheme is analyzed to determine its role in the sentence.
      In this step, the subject, the verb, and the object of the sentence are recognized. Modifiers and the words they modify are also recognized.
      The goal is to grasp the overall structure of the sentence.
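
      As a rough illustration only, the sketch below assigns subject, verb, object, and modifier roles in a simple English sentence. It assumes the input is already tagged with parts of speech at the word level (the tags here are written by hand) and that the word order is subject, verb, object. Real parsers use grammars or trained models instead of these fixed rules.

```python
def parse_svo(tagged):
    """Assign rough roles in a simple 'subject verb object' sentence.

    `tagged` is a list of (word, part_of_speech) pairs.
    """
    roles = {"subject": None, "verb": None, "object": None, "modifiers": []}
    pending = []  # adjectives waiting for the noun they modify
    for word, pos in tagged:
        if pos == "ADJ":
            pending.append(word)
        elif pos == "NOUN":
            # A noun before the verb is the subject, after it the object.
            slot = "subject" if roles["verb"] is None else "object"
            if roles[slot] is None:
                roles[slot] = word
            for adj in pending:
                roles["modifiers"].append((adj, word))
            pending = []
        elif pos == "VERB" and roles["verb"] is None:
            roles["verb"] = word
    return roles

# Hand-tagged example sentence (the tags are an assumption of this sketch).
sentence = [("The", "DET"), ("old", "ADJ"), ("carpenter", "NOUN"),
            ("hit", "VERB"), ("the", "DET"), ("rusty", "ADJ"), ("nail", "NOUN")]
print(parse_svo(sentence))
```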

    3. Semantic Analysis
      Semantic analysis is the step in which the meaning of the sentence is understood.
      It includes WSD (Word Sense Disambiguation) and Substitution processing.

      - Word Sense Disambiguation (WSD)
      A word can have more than one meaning.
      For example, 'nail' can be something used with a hammer or a part of your finger or toe.
      Humans can judge which sense is meant from context, but this is not easy for a computer.
      Therefore, a step to eliminate the ambiguity is needed.
      This step is called WSD.
      WSD uses usage restrictions between words and word co-occurrences.
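
      A minimal sketch of the idea for the 'nail' example: each sense gets a hand-written set of cue words, and the sense whose cues overlap most with the surrounding words is chosen. The sense labels, cue words, and function name are assumptions made for this illustration, not a standard resource.

```python
# Hypothetical cue words for each sense of an ambiguous word.
SENSES = {
    "nail": {
        "fastener": {"hammer", "wood", "drive", "board", "metal"},
        "body_part": {"finger", "toe", "polish", "cut", "hand"},
    }
}

def disambiguate(word, context, senses=SENSES):
    """Pick the sense whose cue words overlap most with the context."""
    context_words = {w.strip(".,!?").lower() for w in context.split()}
    scores = {sense: len(cues & context_words)
              for sense, cues in senses[word].items()}
    # On a tie, max() returns the first sense in dictionary order.
    return max(scores, key=scores.get)

print(disambiguate("nail", "He hit the nail with a hammer."))
print(disambiguate("nail", "She painted the nail on her finger."))
```

      When no cue word appears in the context, every score is zero and the first listed sense wins, so a practical system would need a better fallback such as the most frequent sense.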

      - Substitution processing
      In documents written by people and in everyday conversations, there are many pronouns that point back to nouns used earlier.
      These are called substitutes.
      To grasp the meaning of a sentence, we need to figure out what these substitutes are replacing. This step is called substitution processing, and it is an important research area.
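
      A deliberately naive sketch of substitution processing (often discussed under the name anaphora resolution): each pronoun is linked to the nearest noun that precedes it. The pronoun and noun lists are hypothetical, and the heuristic ignores gender, number, and sentence structure, all of which matter in practice.

```python
PRONOUNS = {"it", "he", "she", "they"}
NOUNS = {"carpenter", "nail", "hammer"}  # hypothetical noun list

def resolve(tokens):
    """Link each pronoun to the nearest preceding noun.

    Returns a list of (pronoun_index, antecedent) pairs.
    """
    last_noun = None
    links = []
    for i, tok in enumerate(tokens):
        word = tok.lower()
        if word in NOUNS:
            last_noun = word
        elif word in PRONOUNS and last_noun is not None:
            links.append((i, last_noun))
    return links

tokens = "The carpenter picked up the hammer . It was heavy .".split()
print(resolve(tokens))
```

      In the example the heuristic happens to pick the right antecedent, but replacing "It" with "He" would still link to "hammer", which shows why agreement checks are needed.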