{"id":448,"date":"2020-03-03T02:42:26","date_gmt":"2020-03-03T02:42:26","guid":{"rendered":"http:\/\/freedville.com\/blog\/?p=448"},"modified":"2020-03-03T02:42:26","modified_gmt":"2020-03-03T02:42:26","slug":"cognitive-system-testing-natural-language-processing-unit-testing","status":"publish","type":"post","link":"https:\/\/freedville.com\/blog\/2020\/03\/03\/cognitive-system-testing-natural-language-processing-unit-testing\/","title":{"rendered":"Cognitive system testing: Natural language processing unit testing"},"content":{"rendered":"\n<p>Part 4 of the\u00a0<a href=\"http:\/\/freedville.com\/blog\/2016\/12\/04\/cognitive-system-testing-from-a-to-z\/\"><strong>Cognitive System Testing<\/strong>\u00a0<\/a>series, originally posted in 2016 on <a href=\"https:\/\/developer.ibm.com\/\">IBM Developer<\/a>.  <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Introduction<\/h2>\n\n\n\n<p>\nNatural Language Processing (NLP) is the way cognitive systems\nextract meaningful information out of plain text. NLP is part art and\npart science, and as such may seem difficult to test using\nautomation. In this chapter and the next, I will describe how\nwe test our NLP pipelines. Again referring to the&nbsp;<a href=\"http:\/\/martinfowler.com\/bliki\/TestPyramid.html\" target=\"_blank\" rel=\"noreferrer noopener\">test\npyramid<\/a>, we will test at the unit, functional,\nand system levels. Today\u2019s chapter focuses on unit-level NLP\ntesting.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Definition<\/h2>\n\n\n\n<p>\nA natural language processing pipeline is made up of several\ncomponents. In several frameworks, including the&nbsp;<a href=\"https:\/\/uima.apache.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Apache\nUIMA framework<\/a>, these components are called\n\u201cannotators\u201d, since they annotate a span of text as having\nmeaning. 
Each annotation includes an annotation type, a span\n(\u201ccovered text\u201d), and potentially other attributes, enough to\ntell you what text is interesting, where it is located, and why it\nis interesting. An NLP unit test focuses on a single annotator and\ndetermines whether it correctly annotates one aspect of a given text.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How to build an NLP unit test suite<\/h2>\n\n\n\n<p>\nGenerally, as you build a series of NLP annotators, you work with\nexample snippets of text and decide how you can train your NLP system\n(either with rules or machine learning) to properly annotate as many\nof these texts as possible. As you find more text snippets that\nexpress a target concept in varying ways, you will continually adapt\nyour NLP to handle them. It is important to capture each target\nvariation in a test case, so that as you add more variations, you can\nverify that existing behavior has not regressed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Worked example<\/h2>\n\n\n\n<p> Let\u2019s pretend we want to write some NLP code to extract instances of dogs from blocks of text. We start with the sentence \u201cI have a dog\u201d. Our first version of the annotator is exceptionally naive:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>for(word in sentence):\n  if(word == \"dog\") then annotate Dog<\/code><\/pre>\n\n\n\n<p>\nWe record a test case to verify output like \u201cI have a&nbsp;<strong>dog<\/strong>\u201d.<\/p>\n\n\n\n<p> We find another sentence, \u201cDogs are great\u201d. We add \u201c<strong>Dogs<\/strong>\u00a0are great\u201d to our test suite. 
We also find a sentence, \u201cThe corgi played with the ball.\u201d Corgis are dogs too, so we\u2019ll add \u201cThe\u00a0<strong>corgi<\/strong>\u00a0played with the ball.\u201d We create a dictionary of dog-related terms called DogDictionary (this exercise is left to the reader), and we update the annotator as follows:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>for(word in sentence):\n  if(word in DogDictionary) then annotate Dog<\/code><\/pre>\n\n\n\n<p> Finally, we find an example of a text we do NOT want annotated. In the sentence \u201cI was dog tired after work today\u201d, we do not want to annotate any word, because no literal dogs are mentioned. We add \u201cI was dog tired after work today\u201d to our test suite, with an indication that the text should contain zero annotations. Our annotator is now:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>for(word in sentence):\n  if(word in DogDictionary\n  and word.part_of_speech == noun) then annotate Dog<\/code><\/pre>\n\n\n\n<p>\nYou can imagine that, after exploring more and more sentences, the annotator\nwill become increasingly complex. It is important to maintain the\ntest suite as new patterns are discovered.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Importance of positive and negative tests<\/h2>\n\n\n\n<p>\nAn NLP annotator can make two types of errors:<\/p>\n\n\n\n<p>\nFalse positives: text was annotated that should not have been\nannotated (affects precision, the fraction of annotations that are\ncorrect)<\/p>\n\n\n\n<p>\nFalse negatives: text was not annotated but should have been\n(affects recall, the fraction of true instances that were\nannotated)<\/p>\n\n\n\n<p>\nImproving NLP accuracy is a careful dance of reducing these two\ncomplementary kinds of errors. An overly aggressive annotator will\nhave high recall and low precision, while an overly passive annotator\nwill have high precision and low recall. 
Get into the habit of\ncollecting representative examples for each error you fix and you\nwill be able to improve both measures.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>\nNatural Language Processing is part art, part science, but that does\nnot mean it can\u2019t be tested with automation. NLP can be tested at\nboth the unit and functional level. When testing at the unit level,\ncollect examples of text that you want (and don\u2019t want) to receive\na certain type of annotation. Functional level NLP testing will be\ncovered next.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Part 4 of the\u00a0Cognitive System Testing\u00a0series, originally posted in 2016 on IBM Developer. Introduction Natural Language Processing (NLP) is the way cognitive systems extract meaningful information out of plain text. NLP is part art and&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/posts\/448"}],"collection":[{"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/comments?post=448"}],"version-history":[{"count":2,"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/posts\/448\/revisions"}],"predecessor-version":[{"id":451,"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/posts\/448\/revisions\/451"}],"wp:attachment":[{"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/media?parent=448"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/categories?post=448"},{"taxonomy":"post_tag
","embeddable":true,"href":"https:\/\/freedville.com\/blog\/wp-json\/wp\/v2\/tags?post=448"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}