Full Text Search

Full Text Search

BB16
Sep 29th, 2008
SQL Server 2008: Beyond Relational
A look at the new ways of working with semi- or unstructured data in SQL Server 2008

Speakers

Michael Rhys
Traditionally, databases have been used to store structured data in tables with rows and columns. However, in the real world there is a lot of unstructured or semi-structured data that also need to be stored and retrieved. SQL Server 2008 manages this with a number of new features.
This talk started off with a bunch of pain points and goals of the SQL Server 2008 product, as well as a comparison of features between 2005 and 2008. / Mentioned Technologies
  • SQL Server 2005
  • SQL Server 2008
  • XML
  • FILESTREAM
  • HierarchyID
  • Full Text Search
  • Spatial
  • Sparse

The first major feature that was talked about was the FILESTREAM data type and how it can help developers store BLOB objects in the NTFS file system while maintaining a relationship with a row in a table. This allows maximizing performance while also keeping manageability of the data.
The next issue up was regarding the Full Text Search (FTS) and challenges in it. The session then detailed the number of improvements in SQL Server 2008, including some syntax and performance benchmarks. Next up was one of the most important new features of SQL Server 2008 – Spatial Data. The ability to store and work with Geographical and Geometrical data within SQL Server 2008.
Improvements in the XML datatype were talked about next and then the discussion moved on to working with semi-structured data. First up was the new HierarchyID column and how it helps in storing hierarchies in the database. Then came the usage of Wide, Sparse, and Columns sets that allow a table to have an extremely huge number of (nullable) columns in it. Finally, we ended with a little discussion about the new Filtered Indexes.

List of demos

Pain Overview / Pain points in working with unstructured and semi-structured data / About 5 min into the session
FileStream / How to create and use a FILESTREAM / 10 min into the session
XML Improvements / A look at XML schema improvements / 20 min into the session
Hierarchy ID / Creating and using hierachyID-based tables / 30 min into the session
Sparse Columns & Filtered Index / Working with large sets of nullable columns and creating indexes for only a part of the data in a table / 42 min into the session

Conclusion

SQL Server 2008 provides a huge number of features for the storage and manipulation of both unstructured and semi-structured data. These include:
  • Spatial Data
  • HierarchyID
  • FILESTREAM
  • Sparse Columns
  • Filtered Indexes
  • New FTS
  • XML Improvements
  • And more…

Vinod Unny