dtSearch releases Version 7.01 with enhancements, fixes

Enterprise Networks & Servers, Jul 2005

dtSearch Corp., a supplier of enterprise, end-user and developer text retrieval software, released Version 7.01 of the dtSearch product line on June 14, almost a month after dtSearch 7.0.

Enhancements in dtSearch Web include a generated search form with more flexible stylesheet references and an option for file parsers to generate HTML output using "em" units for font sizes instead of points, which lets font sizes scale up or down in Internet Explorer.
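To illustrate why "em" units scale where point sizes do not, here is a minimal Python sketch that rewrites a point size as a multiple of a base font size. The function names and the 12-point base are assumptions made for this example; they are not taken from dtSearch.

```python
def pt_to_em(size_pt: float, base_pt: float = 12.0) -> float:
    """Express a point size as a multiple of a base font size (em units)."""
    if base_pt <= 0:
        raise ValueError("base_pt must be positive")
    return round(size_pt / base_pt, 3)


def font_style(size_pt: float, base_pt: float = 12.0) -> str:
    """Build a CSS declaration using the relative size instead of points."""
    return f"font-size: {pt_to_em(size_pt, base_pt):g}em"


if __name__ == "__main__":
    # An 18pt heading becomes 1.5 times whatever size the browser is using.
    print(font_style(18))
```

Because the result is relative, a browser text-size setting that changes the base size changes every em-based size with it, while a fixed point size stays put.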

Enhancements in dtSearch Publish include a "Recognize CD" function that lets the CD Wizard modify a CD that was created on a different computer.

Fixes in Version 7.01 include:

* dten600.dll: MS Word file parser caused a word break when MS Word inserted redundant font changes within a word

* lbview.exe: Error opening PDF file with URL-encoded apostrophe in filename or path

* dten600.dll: PowerPoint file parser error parsing slide without outline entry text

* dtSearchNetApi.dll: SearchResultsItem did not include modified date or type id

* dten600.dll: SearchResults did not read HitsByWord when serializing from XML

* dtSearch Publish: PDF files did not highlight hits in Adobe Reader 7 on some systems with unpatched versions of IE components

* dtIndexer.exe: Default setting for Index AutoCommitIntervalMB forced large index updates to commit too frequently, making indexing slower
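The lbview.exe item in the list above concerns filenames whose apostrophe arrives URL-encoded as %27. As a rough illustration of that kind of decoding step, the following Python sketch uses the standard library; the function name and sample path are hypothetical, and nothing here reflects how lbview.exe itself is written.

```python
from urllib.parse import quote, unquote


def decode_path(url_path: str) -> str:
    """Turn percent-escapes such as %27 and %20 back into literal characters."""
    return unquote(url_path)


if __name__ == "__main__":
    encoded = quote("reports/O'Brien summary.pdf")  # apostrophe becomes %27
    print(encoded)
    print(decode_path(encoded))
```

A viewer that looks up the file under the still-encoded name will not find it, which is the general class of error the fix addresses.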

7.0 released May 18

With Version 7, the product line could search terabytes of text across a desktop, network, Internet or intranet. The 7.0 release covered dtSearch Web with Spider, dtSearch Desktop with Spider, dtSearch Network with Spider, dtSearch Publish and the dtSearch Text Retrieval Engine.

Before Version 7, the dtSearch index format could hold 4 to 8 gigabytes of text per index. Beginning with Version 7, the index format can hold more than a terabyte of text in a single index. Search time with more than a terabyte of text is typically less than a second. As with previous versions, a single search can span any number of indexes.

All dtSearch products share the same core search and display functionality, including over two dozen indexed, unindexed, full-text and fielded data search options; display of HTML, XML and PDF files with highlighted hits and with embedded images, links and formatting intact; and built-in HTML converters for display of word processor, database, spreadsheet, e-mail (including attachments), ZIP, Unicode and other files with highlighted hits.

The dtSearch Spider, embedded in multiple dtSearch products, provides integrated searching of remote web site content along with locally available data. In addition to the file formats above, the dtSearch Spider can index and search dynamically generated content, such as ASP/ASP.NET, MS CMS and MS SharePoint.

The Spider can follow links vertically within a website, or horizontally across websites, to any specified level of depth. The Spider supports public sites, secure HTTPS sites, password-accessible sites and forms-based authentication. After a search, dtSearch provides integrated relevancy ranking and display of local and Spidered content, including WYSIWYG display of web-ready content with highlighted hits.
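The idea of following links to a specified depth, either within one site or across sites, can be sketched as a breadth-first walk. The Python below is an illustration only: it runs over an in-memory link map rather than fetching pages, and the function name, parameters and example URLs are all assumptions, not part of the dtSearch Spider.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start, links, max_depth, stay_on_site=True):
    """Breadth-first walk over `links` (url -> list of linked urls).

    Pages more than `max_depth` links away from `start` are not visited.
    With stay_on_site=True, links to other hosts are skipped.
    """
    start_host = urlparse(start).netloc
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # depth limit reached; do not follow this page's links
        for target in links.get(url, []):
            if target in seen:
                continue
            if stay_on_site and urlparse(target).netloc != start_host:
                continue
            seen.add(target)
            queue.append((target, depth + 1))
    return order


if __name__ == "__main__":
    # Hypothetical link map standing in for fetched pages.
    site = {
        "http://a.example/": ["http://a.example/p1", "http://b.example/"],
        "http://a.example/p1": ["http://a.example/p2"],
        "http://b.example/": ["http://b.example/x"],
    }
    print(crawl("http://a.example/", site, max_depth=1))
    print(crawl("http://a.example/", site, max_depth=2, stay_on_site=False))
```

The `seen` set keeps the walk from revisiting a page that is linked from several places, which matters on any real site with navigation menus.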

dtSearch Web quickly publishes a large volume of instantly searchable data to an Internet or intranet site, with Spider functionality. The dtSearch Engine lets developers add dtSearch's built-in file format support and searching to Web-based and other applications. The dtSearch Engine API support includes SQL, Delphi, Java, C++, C++ .NET, C#, VB.NET and ASP.NET.

dtSearch Desktop instantly searches popular file types on a PC, while dtSearch Network searches across a network, running in a client/server capacity. dtSearch Publish can quickly publish an instantly searchable document collection to CD, DVD or similar media. The product can also mirror an existing web site on CD/DVD. The resulting CD/DVD application can run without installing anything on the user's hard disk.

The dtSearch product line offers over two dozen indexed, unindexed, fielded and full-text search options. These include: fuzziness adjustable from 0 to 10 (to sift through typographical and spelling errors), synonym/concept/thesaurus, Boolean (and/or/not), natural language relevancy ranking (by hit term frequency, density and rarity), positional scoring ranking, phrase, phonic, wildcard, bilateral proximity, directed proximity, stemming, numeric range, user-defined variable term weighting, and international language support through Unicode.
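The article does not say how dtSearch computes fuzziness. As a stand-in, the Python sketch below uses Levenshtein edit distance to show how an adjustable tolerance can admit misspelled words; the function names and the mapping of "fuzziness" to a maximum edit distance are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if characters match
            ))
        prev = cur
    return prev[-1]


def fuzzy_matches(term, words, fuzziness):
    """Return the words within `fuzziness` edits of `term`, ignoring case."""
    term = term.lower()
    return [w for w in words if edit_distance(term, w.lower()) <= fuzziness]


if __name__ == "__main__":
    vocabulary = ["retreival", "retrieval", "revival"]
    print(fuzzy_matches("retrieval", vocabulary, 0))
    print(fuzzy_matches("retrieval", vocabulary, 2))
```

At a tolerance of 0 only exact matches pass; raising it lets through progressively more distant spellings, at the cost of more false hits.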

Special forensically-oriented features include: automatic parsing of text segments in large data blocks, such as those recovered through an "undelete" process, from unallocated computer space, or from partially recovered file fragments; language recognition algorithms for detecting text in a large variety of languages (Western European, other European, Middle-Eastern, etc.); a proprietary filtering algorithm for scanning recovered data blocks using multiple text encoding detection methods; and automatic recovery of text from corrupt forensically-retrieved documents.
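dtSearch's filtering algorithm is proprietary and not described here. To give a general sense of what pulling text segments out of a recovered data block involves, the following Python sketch extracts runs of printable ASCII from raw bytes. It handles only a single encoding, and the function name, minimum run length and sample data are assumptions made for this example.

```python
import re


def text_segments(block: bytes, min_len: int = 4):
    """Return runs of at least `min_len` printable ASCII characters found in `block`."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode("ascii") + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(block)]


if __name__ == "__main__":
    # Hypothetical fragment: readable text surrounded by binary noise.
    recovered = b"\x00\x01Invoice 2005\xff\xfeab\x00dtSearch\x00"
    print(text_segments(recovered))
```

A real forensic filter would also need to try other encodings, such as UTF-16 or legacy code pages, which is where the multiple detection methods mentioned above come in.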

 

Content provided in partnership with ProQuest