Abstract
Patent offices worldwide receive considerable numbers of patent documents that aim at describing and protecting innovative artifacts, processes, algorithms, and other inventions. Apart from the main text description, these documents may contain figures, drawings, and diagrams that help explain the patented object. Two main directions are presented in this chapter: concept-based and content-based patent retrieval. Concept-based search utilizes textual and visual information, combining the two in a late fusion stage of the classification. Conversely, content-based retrieval relies on the shape and content information of patent images, and therefore on visual descriptors extracted from binary images. Concepts are extracted using classification techniques such as support vector machines and random forests. Adaptive hierarchical density histograms serve as a binary image retrieval technique that combines high efficiency and effectiveness while remaining compact, and is therefore capable of dealing with large binary image databases. Given the vast number of images included in patent documents, it is highly important for patent experts to be able to examine them when trying to understand the patent contents and identify relevant inventions. Patent experts would therefore benefit greatly from a tool that supports efficient patent image retrieval and extends standard figure browsing and metadata-based retrieval by providing content-based search according to the query-by-example paradigm.
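To make the late fusion step concrete, the following minimal sketch combines per-concept scores from a text classifier and a visual classifier by weighted averaging. Weighted averaging is only one common late fusion rule and is assumed here for illustration; the abstract does not state which rule the chapter uses. The function names, the `text_weight` parameter, and the concept labels are likewise hypothetical.

```python
def late_fusion(text_scores, visual_scores, text_weight=0.5):
    """Fuse per-concept scores from two classifiers by weighted averaging.

    Assumption: both inputs map a concept label to a score in [0, 1],
    e.g. a calibrated classifier output. A concept missing from one
    modality is treated as having a score of 0.0 in that modality.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    fused = {}
    for concept in set(text_scores) | set(visual_scores):
        t = text_scores.get(concept, 0.0)
        v = visual_scores.get(concept, 0.0)
        fused[concept] = text_weight * t + (1.0 - text_weight) * v
    return fused


def top_concept(fused_scores):
    """Return the concept label with the highest fused score."""
    return max(fused_scores, key=fused_scores.get)


if __name__ == "__main__":
    # Hypothetical scores for one patent figure and its surrounding text.
    text_scores = {"flowchart": 0.8, "circuit": 0.2}
    visual_scores = {"flowchart": 0.4, "circuit": 0.6}
    fused = late_fusion(text_scores, visual_scores, text_weight=0.5)
    print(top_concept(fused))
```

Because fusion happens after each modality has been classified separately, the two classifiers can be trained and replaced independently; only the fusion weight ties them together.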