Many data architects have no experience with big data and feel overwhelmed by the number of options available to them, including vendor and storage options. They are often uncomfortable with new big data management technologies. Big data architecture differs from traditional data architecture in a few key ways:
- Big data architecture starts with the data itself, taking a bottom-up approach. Decisions about data influence decisions about components that use data.
- Big data introduces new data sources such as social media content and streaming data.
- The enterprise data warehouse (EDW) becomes a source for big data.
- Master data management (MDM) is used as an index to content in big data about the people, places, and things the organization cares about.
- The variety of big data and unstructured data requires a new type of persistence.
- Analytics capabilities need to be expanded to handle the variety, volume, and velocity of big data.
- Big data applications leverage reporting and visualization in new ways to integrate information and generate new insights.
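
To make the MDM point in the list above more concrete, here is a minimal sketch of master data acting as an index into unstructured content. Everything in it is an assumption for illustration: the record IDs, the alias lists, the sample posts, and the `build_mdm_index` helper are hypothetical, and the naive substring matching stands in for whatever entity-resolution technique an organization would actually use.

```python
from collections import defaultdict

# Hypothetical master records: entity ID -> canonical name and known aliases.
MASTER_RECORDS = {
    "CUST-001": {"name": "Acme Corp", "aliases": ["acme", "acme corp"]},
    "CUST-002": {"name": "Globex", "aliases": ["globex"]},
}

# Hypothetical unstructured content, e.g. social media posts.
DOCUMENTS = {
    "post-1": "Great support from Acme today",
    "post-2": "Globex and Acme Corp announce a partnership",
    "post-3": "Nothing relevant here",
}


def build_mdm_index(master_records, documents):
    """Map each master entity ID to the IDs of documents that mention it.

    Matching is a case-insensitive substring check on the aliases, which is
    deliberately simplistic and only meant to show the shape of the index.
    """
    index = defaultdict(list)
    for doc_id, text in documents.items():
        lowered = text.lower()
        for entity_id, record in master_records.items():
            if any(alias in lowered for alias in record["aliases"]):
                index[entity_id].append(doc_id)
    return dict(index)


index = build_mdm_index(MASTER_RECORDS, DOCUMENTS)
print(index)
```

In this sketch the master record stays the single authoritative description of the entity, while the index only records where that entity appears in the big data content.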
Before making technology decisions about the big data architecture, make sure a strategy is in place to document architecture principles and guidelines, the organization’s big data business pattern, and high-level functional and quality-of-service requirements.