Tag Archives: Evaluating

Experts urge caution when evaluating marketing claims for AI tools


Unbridled enthusiasm for all things artificial intelligence is so last year.

While 2016 was marked by huge amounts of hype around the growing prominence of AI tools in the enterprise, the new year is shaping up to be much more skeptical. That hard-nosed realism around all things AI was a big part of the Gartner Data & Analytics Summit in Grapevine, Texas.

“Just because we have AI doesn’t mean we get a better decision. It’s just a tool,” said Scott Zoldi, chief analytics officer at software vendor FICO, in a presentation at the conference. Zoldi leads development of analytics products at FICO that include credit scoring models and fraud detection systems, some incorporating AI and machine learning algorithms.

For Zoldi, advanced machine learning practices, including AI, have followed in the wake of big data in terms of hype. After spending the last five years or so accumulating huge data sets, businesses are now looking for ways to extract value from them. Machine learning is widely seen as a way of making sense of and learning from large data volumes. That has contributed to the AI hype we’re seeing today, and software vendors are looking to capitalize on the excitement, Zoldi said.

“There’s lots of applications for analytics and AI,” he said. “There’s also lots of ways it can go wrong. It takes a lot of time to learn how to use AI responsibly and not be misled.”

The need for caution when evaluating AI tools was echoed by several Gartner analysts. Alexander Linden said there’s currently a “zoo” of AI technologies, many of which are being promoted by software companies using lofty claims. He pointed to IBM’s marketing claim that its cognitive platform Watson “can think like a human.”

AI that solves everything will remain a fantasy for a very long time.

Alexander Linden, analyst, Gartner

That’s not the case. Linden said most of today’s AI tools are far from this kind of general intelligence; instead, they focus on relatively narrow tasks. For example, AlphaGo, the AI program developed last year by Google’s DeepMind, mastered the ancient strategy game Go but would be unable to compete on the game show Jeopardy!, while Watson, which beat human contestants on the show in 2011, would struggle to play Go competently. General, human-like intelligence is still a long way off.

“Marketing messages like [IBM’s] confuse people,” Linden said. “AI that solves everything will remain a fantasy for a very long time.”

In the meantime, analyst Tom Austin of Gartner recommended that enterprises interested in AI tools think more about point products that address specific needs. This includes things like chatbots and intelligent customer assistants. There have been significant advances in AI technologies in recent years, he said, even if those achievements don’t quite match the vendor hype. Right now, purpose-built AI applications tend to perform better than more general-purpose tools, he said.

Austin also recommended that enterprises think about the near term. There’s so much development going on around AI that what looks sleek and shiny today could be obsolete six months from now.

Additionally, most of the big players in the space, like Amazon, Google, IBM and Microsoft, are fighting it out to position their platforms as the de facto standard. It’s too early to say which will win or which will offer the strongest set of AI tools. So getting locked into a longer-term contract at this stage would be a mistake, Austin said.

“You should be focused on quick time to business value,” he said. “Have no patience. There [are] many grandiose schemes that suppliers can talk about, but you want to pick something that will work.”


SearchBusinessAnalytics: BI, CPM and analytics news, tips and resources

Best Practices When Evaluating a New Ecommerce Solution


Posted by Maggie Miller, Senior Commerce Content Manager

Most businesses, no matter how big or small, have some type of online presence in today’s digital world. But as the ecommerce landscape continues to change, many companies are realizing that the systems running their business have become outdated and can no longer effectively support their operations or their customers’ expectations for consistent, relevant experiences across touchpoints.

To deliver these seamless, omnichannel experiences, front-end and back-end systems need to be on a single, unified commerce platform. A unified platform creates a central repository for order, customer and inventory data from all channels. That data can then be supplied to front-end, customer-facing systems, such as ecommerce, POS and call centers, ensuring accurate and relevant information across all customer touchpoints. The result is improved business efficiency and customer experience.

In a recent webinar hosted by Ultra Consulting, Sanjay Mehta, Oracle’s NetSuite Industry Principal for ecommerce, discusses the value of a unified commerce platform and outlines some best practices to consider when looking for an ecommerce solution, including:

Design flexibility and tools for experience management

As a brand, you need unlimited control over your user experience and design. This allows you to differentiate yourself while delivering a great experience for your customers. Business users should have tools that let them update and maintain the site themselves, rather than being beholden to technical staff.

Leverage a pre-built, starting-point store

Why reinvent the wheel? Take advantage of prebuilt leading practices and customize from there, reducing your time to market and your costs.

Must be mobile 

Mobile is now part of almost every shopping journey. Research shows more than 50 percent of B2B buyers are using their mobile devices to learn about products. Providing a responsive experience that accommodates your customers’ devices is key to a great customer experience and to improving your search engine results.

Provide self-service account management

Online account management lets you offload time and resources from your call center, service department and even sales reps. Most customers today expect to be able to view their orders, cancel an order, check order status or get help any time they need it. For B2B buyers, additional online capabilities matter beyond account management, such as viewing and paying invoices, creating saved lists of frequently purchased items and generating quotes.

Create a rich, interactive and personalized shopping experience

Dynamic merchandising, faceted search and guided navigation all help you drive sales, whether you’re selling to consumers or businesses. Personalizing and localizing the experience based on the customer, inventory and order data in your back-end systems makes your engagements more relevant, driving customer delight and retention.

To learn more, and to see a demo of Oracle’s NetSuite ecommerce solution, SuiteCommerce, watch the on-demand webinar, Ecommerce Best Practices with NetSuite.

Posted on Wed, February 8, 2017 by NetSuite


The NetSuite Blog

Evaluating Shared Expressions in Tabular 1400 Models

In our December blog post, Introducing a Modern Get Data Experience for SQL Server vNext on Windows CTP 1.1 for Analysis Services, we mentioned that SSDT Tabular does not yet support shared expressions, but the CTP 1.1 Analysis Services engine already does. So how can you get started using this exciting new enhancement to Tabular models now? Let’s take a look.

With shared expressions, you can encapsulate complex or frequently used logic through parameters, functions, or queries. A classic example is a table with numerous partitions. Instead of duplicating a source query with minor modifications in the WHERE clause for each partition, the modern Get Data experience lets you define the query once as a shared expression and then use it in each partition. If you need to modify the source query later, you only need to change the shared expression, and all partitions that refer to it automatically pick up the changes.

In a forthcoming SSDT Tabular release, you’ll find an Expressions node in Tabular Model Explorer that will contain all your shared expressions. However, if you want to evaluate this capability now, you’ll have to create your shared expressions programmatically. Here’s how:

  1. Create a Tabular 1400 model by using the December release of SSDT 17.0 RC2 for SQL Server vNext CTP 1.1 Analysis Services. Remember that this is an early preview: install only the Analysis Services components, not the Reporting Services and Integration Services components; don’t use this version in a production environment; install fresh rather than upgrading from a previous SSDT version; and work only with Tabular 1400 models in this preview. For Multidimensional as well as Tabular 1100, 1103 and 1200 models, use SSDT version 16.5.
  2. Modify the Model.bim file from your Tabular 1400 project by using the Tabular Object Model (TOM). Apply your changes programmatically and then serialize the changes back into the Model.bim file.
  3. Process the model in the preview version of SSDT Tabular. Just keep in mind that SSDT Tabular doesn’t yet know how to deal with shared expressions, so don’t attempt to modify the source query of a table or partition that relies on a shared expression, as SSDT Tabular may become unresponsive.

Let’s go through these steps in greater detail by converting the source query of a presumably large table into a shared query, and then defining multiple partitions based on this shared query. As an optional step, afterwards you can modify the shared query and evaluate the effects of the changes across all partitions. For your reference, download the Shared Expression Code Sample.

If you want to follow the explanations on your own workstation, create a new Tabular 1400 model as explained in Introducing a Modern Get Data Experience for SQL Server vNext on Windows CTP 1.1 for Analysis Services. Connect to an instance of the AdventureWorksDW database and import, among others, the FactInternetSales table. A simple source query suffices, as in the following screenshot.

[Screenshot: FactInternetSales source query]

As you’re going to modify the Model.bim file of a Tabular project outside of SSDT, make sure you close the Tabular project at this point. Then start Visual Studio, create a new Console Application project, and add references to the TOM libraries as explained under “Working with Tabular 1400 models programmatically” in Introducing a Modern Get Data Experience for SQL Server vNext on Windows CTP 1.1 for Analysis Services.

The first task is to deserialize the Model.bim file into an offline database object. The following code snippet gets this done (you might have to update the bimFilePath variable). Of course, you can have a more elaborate implementation using OpenFileDialog and error handling, but that’s not the focus of this article.

using System.IO;
using TOM = Microsoft.AnalysisServices.Tabular;

string bimFilePath = @"C:\Users\Administrator\Documents\Visual Studio 2015\Projects\TabularProject1\TabularProject1\Model.bim";
var tabularDB = TOM.JsonSerializer.DeserializeDatabase(File.ReadAllText(bimFilePath));

The next task is to add a shared expression to the model, as the following code snippet demonstrates. Again, this is a bare-bones minimum implementation. The code will fail if an expression named SharedQuery already exists. You could check for its existence by using if (tabularDB.Model.Expressions.Contains("SharedQuery")) and skip the creation if it does, as sketched after the snippet.

tabularDB.Model.Expressions.Add(new TOM.NamedExpression() {
    Kind = TOM.ExpressionKind.M,
    Name = "SharedQuery",
    Description = "A shared query for the FactInternetSales Table",
    Expression = "let "
        + "    Source = AS_AdventureWorksDW, "
        + "    dbo_FactInternetSales = Source{[Schema=\"dbo\",Item=\"FactInternetSales\"]}[Data] "
        + "in "
        + "    dbo_FactInternetSales",
});
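If you rerun the program against the same Model.bim file, the Add call above will fail because SharedQuery already exists. A minimal sketch of the existence check mentioned above, using the same tabularDB object:

// Skip creation if the model already contains a shared expression named SharedQuery.
if (!tabularDB.Model.Expressions.Contains("SharedQuery"))
{
    // ... create and add the TOM.NamedExpression as shown above ...
}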

Perhaps the most involved task is to remove the existing partition from the target (FactInternetSales) table and create the desired number of new partitions based on the shared expression. The following code sample creates 10 partitions and uses the Table.Range function to split the shared expression into chunks of up to 10,000 rows. This is a simple way to slice the source data. Typically, you would partition based on the values from a date column or other criteria.

// Replace the original full-table partition with 10 range-based partitions.
tabularDB.Model.Tables["FactInternetSales"].Partitions.Clear();
for (int i = 0; i < 10; i++) {
    tabularDB.Model.Tables["FactInternetSales"].Partitions.Add(new TOM.Partition() {
        Name = string.Format("FactInternetSalesP{0}", i),
        Source = new TOM.MPartitionSource() {
            Expression = string.Format("Table.Range(SharedQuery,{0},{1})", i * 10000, 10000),
        },
    });
}
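For illustration only, a date-based scheme might instead generate one partition per year by filtering the shared query with Table.SelectRows. The year range and the OrderDateKey column (an integer in yyyymmdd form in the AdventureWorksDW sample) are assumptions for this sketch, not part of the original sample code:

// Hypothetical alternative: one partition per calendar year of FactInternetSales.
for (int year = 2011; year <= 2014; year++) {
    tabularDB.Model.Tables["FactInternetSales"].Partitions.Add(new TOM.Partition() {
        Name = string.Format("FactInternetSalesY{0}", year),
        Source = new TOM.MPartitionSource() {
            Expression = string.Format(
                "Table.SelectRows(SharedQuery, each [OrderDateKey] >= {0} and [OrderDateKey] <= {1})",
                year * 10000 + 101, year * 10000 + 1231),
        },
    });
}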

The final step is to serialize the resulting Tabular database object with all the modifications back into the Model.bim file, as the following line of code demonstrates.

File.WriteAllText(bimFilePath, TOM.JsonSerializer.SerializeDatabase(tabularDB));

Having serialized the changes back into the Model.bim file, you can open the Tabular project again in SSDT. In Tabular Model Explorer, expand Tables, FactInternetSales, and Partitions, and verify that 10 partitions exist, as illustrated in the following screenshot. Verify that SSDT can process the table by opening the Model menu, pointing to Process, and then clicking Process Table.

[Screenshot: Processing the FactInternetSales table in SSDT Tabular]

You can also verify the query expression for each partition in Partition Manager. Just remember that you must click the Cancel button to close the Partition Manager window. Do not click OK: with the December 2016 preview release, SSDT could become unresponsive.

Congratulations! Your FactInternetSales table now effectively uses a centralized source query shared across all partitions. You can now modify the source query without having to update each individual partition. For example, you might decide to remove the ‘SO’ part from the values in the SalesOrderNumber column to get the order number in numeric form. The following screenshot shows the modified source query in the Advanced Editor window.

[Screenshot: Modified source query in the Advanced Editor window]

Of course, you cannot edit the shared query in SSDT yet. But you could import the FactInternetSales table a second time and then edit the source query on that table. When you achieve the desired result, copy the M script into your TOM application to modify the shared expression accordingly. The following lines of code correspond to the screenshot above.

tabularDB.Model.Expressions["SharedQuery"].Expression = "let "
    + "    Source = AS_AdventureWorksDW, "
    + "    dbo_FactInternetSales = Source{[Schema=\"dbo\",Item=\"FactInternetSales\"]}[Data], "
    + "    #\"Split Column by Position\" = Table.SplitColumn(dbo_FactInternetSales,\"SalesOrderNumber\",Splitter.SplitTextByPositions({0, 2}, false),{\"SalesOrderNumber.1\", \"SalesOrderNumber\"}), "
    + "    #\"Changed Type\" = Table.TransformColumnTypes(#\"Split Column by Position\",{{\"SalesOrderNumber.1\", type text}, {\"SalesOrderNumber\", Int64.Type}}), "
    + "    #\"Removed Columns\" = Table.RemoveColumns(#\"Changed Type\",{\"SalesOrderNumber.1\"}) "
    + "in "
    + "    #\"Removed Columns\"";

One final note of caution: If you remove columns in your shared expression that already exist on the table, make sure you also remove these columns from the table’s Columns collection to bring the table back into a consistent state.
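A minimal TOM sketch of that cleanup (the column name is a hypothetical placeholder for whichever column your modified query no longer returns):

// If the shared query no longer returns a column that the table still defines,
// drop the stale column from the table's metadata before serializing.
var factTable = tabularDB.Model.Tables["FactInternetSales"];
if (factTable.Columns.Contains("SomeRemovedColumn"))
{
    factTable.Columns.Remove(factTable.Columns["SomeRemovedColumn"]);
}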

That’s about it on shared expressions for now. Hopefully in the not-so-distant future, you’ll be able to create shared parameters, functions, and queries directly in SSDT Tabular. Stay tuned for more updates on the modern Get Data experience. And, as always, please send us your feedback via the SSASPrev email alias here at Microsoft.com or use any other available communication channels such as UserVoice or MSDN forums. You can influence the evolution of the Analysis Services connectivity stack to the benefit of all our customers.


Analysis Services Team Blog

How do I insert OwnValues inside a held expression without evaluating it?


Here is a very long and complicated expression, which we abbreviate as a. I store it using SetDelayed because I want to perform algebraic manipulations on it:

a := 1 + 1

Here is a really complicated function f with the attribute HoldFirst; it counts the number of times 1 appears in its first argument.

SetAttributes[f, HoldFirst];

f[input_] := Module[{expr = Hold[input]}, Count[expr, 1, Infinity]]

As you can see, directly inserting the complicated expression works, but not if you insert the abbreviation a:

f[1 + 1]

(* 2 *)      (* good *)

f[a]

(* 0 *)      (* not good *)

The reason it doesn’t work is that Hold prevents the definition from being inserted. So, in the second example, Count is seeing the symbol a and not the expression 1 + 1 to which it points.
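You can see this by inspecting the stored definition; it is a delayed rule whose right-hand side stays unevaluated:

OwnValues[a]

(* {HoldPattern[a] :> 1 + 1} *)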

Question: How do I insert OwnValues verbatim inside a held expression without evaluating it?


To restate the problem minimally, here is the definition again:

a := 1 + 1

Here is a sample held expression containing symbols which may or may not have OwnValues:

Hold[a + b + c]

How do I insert the RHS of the definition of a verbatim into the held expression, so that the result is this?:

Hold[(1 + 1) + b + c]

I have the following (which may or may not be fruitful):

Hold[a + b + c] /. (symb_Symbol /; OwnValues[symb] =!= {} :>
   With[{ov = First @ OwnValues[symb]}, ov /; True])

(*  Hold[(HoldPattern[a] :> 1 + 1) + b + c] *)
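One way to finish from there (a sketch added for illustration, not part of the original question) is a second structural replacement that keeps only the right-hand side of the injected rule; because the replacement happens inside Hold, the 1 + 1 is inserted without being evaluated:

Hold[(HoldPattern[a] :> 1 + 1) + b + c] /.
  Verbatim[RuleDelayed][Verbatim[HoldPattern][_], rhs_] :> rhs

(*  Hold[(1 + 1) + b + c]  *)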


Recent Questions – Mathematica Stack Exchange

What to consider when evaluating Hadoop software distributions


Apache Hadoop is at the heart of many big data environments, supporting large-scale, data-intensive applications. Its variety of open source software components and related tools for capturing, processing, managing and analyzing data, and the low overall cost of Hadoop clusters, are alluring to many organizations. But, as this series has examined, the open source Hadoop framework only offers so much, and companies that need more robust performance and functionality, as well as maintenance and support, are turning to commercial Hadoop software distributions.

Because Hadoop is a technology that’s managed via The Apache Software Foundation’s open source process, the sales model of Hadoop distribution vendors differs from that of proprietary software development companies. The Hadoop source code is open, meaning that it’s available to anyone who wants to access it, so product offerings have to be differentiated by what the vendors provide beyond the openly accessible functionality.

Once you’ve determined that your organization could benefit from a commercial Hadoop big data distribution, the next step is to explore some value-added supplements to the code base and key features offered by Hadoop vendors and determine how these offerings match your needs.

What are the Hadoop distribution vendors really selling?

IT teams can download Hadoop from the Apache website and deploy it on a hardware cluster themselves, without any vendor involvement. But Hadoop vendors are aware that the self-starter approach isn’t for everyone, so they provide prebuilt Hadoop distributions that can be downloaded from their websites — typically in both a free community edition and an enterprise edition that adds more features and requires the purchase of a license. But if these vendors are providing users with a product, what are they really selling? In other words, what do you actually get when you engage and pay a Hadoop software vendor?

Vendors offering commercial versions of open source technologies, such as those providing big data management systems based on Hadoop, follow an alternative software and services model in which customers effectively subscribe to the enterprise edition of the product. Benefits of subscribing to an enterprise edition include:

Access to enterprise features. The subscription relationship enables customers to access versions of Hadoop that have features and optimizations that haven’t been openly released to the open source community.

Release from restrictions. In some situations, the freely downloadable Hadoop distributions have been built with restrictions, such as a limit to the number of nodes on which the system can be run or the amount of data that can be managed. Buying an enterprise subscription lifts these restrictions.

Responsive technical support. Enterprise subscriptions provide support resources with 24/7 telephone access and response times guaranteed under service-level agreements, depending on the level of support purchased.

Advanced training. While all website visitors may have access to some training materials and videos, enterprise subscribers typically are entitled to more advanced and extensive training sessions.

Access to deployment experts. Hadoop vendors have professional services teams that are experienced in big data management deployments and can help jump-start a customer’s implementation.

Key considerations for comparing Hadoop distribution vendors

The enterprise editions of vendor Hadoop distributions all provide the core components of the Hadoop ecosystem stack, which include the Hadoop Distributed File System (HDFS), the MapReduce programming and execution environment for batch processing, and the YARN job scheduler and cluster resource manager. They also commonly incorporate various other open source technologies, such as the Spark data processing engine and HBase database. But different vendors may support different releases of those technologies, and newer or more specialized tools may not be universally supported. If your organization is looking to use a particular technology as part of a Hadoop deployment, ensure that the distributions you’re considering support it, and check which release of it they currently include.

Beyond these typical components, you should also compare and contrast how each vendor provides the following:

Access to enterprise-class features. Some Hadoop vendors offer additional tools for system configuration, performance management, ongoing monitoring and administration that aren’t part of the open source distribution. While these may add value to the enterprise distribution, recognize that integration with proprietary components may lock the customer into that vendor’s product.

Infrastructure deployment alternatives. Your organization may choose to adopt different underlying infrastructure options, such as running on-premises, in the cloud or in virtualized environments. Consider how the Hadoop distribution alternatives are adaptable to these infrastructure choices.

Interoperability with other data management systems. In most cases, an organization will have existing data warehousing, business intelligence and analytics systems in place. Hadoop typically doesn’t fully replace these systems, but rather augments and complements them. So it’s critical that the adopted Hadoop environment enable access and data exchange with existing data management platforms such as DB2, Oracle, SQL Server, Teradata and others.

Integration with end-user tools. End users will want to continue using their favorite tools for business intelligence, reporting, visualization and analytics. Assess how well the Hadoop big data management vendor’s distribution supports integration with the tools used in your organization.

Security and data protection. The Apache Hadoop ecosystem is still maturing, which means that not all of its components may meet enterprise expectations for data security and protection. Many Hadoop vendors provide security features as add-ons.

Support options. Consider what your support requirements are in terms of availability and response times. Vendors offer different plans for support availability as well as response windows.

Indemnification from litigation arising from use of open source technology. This increasingly important protection ensures that vendors of open source technologies shield their users from potential liabilities related to the use of the product.

Optimized performance. Enterprise distributions may be augmented with performance optimizations that enhance scalability and extensibility.

One additional consideration when comparing Hadoop distribution vendor offerings relates to the approach that vendors are taking toward compatibility within the open source community and interoperability between product offerings from different companies. Ideally, this means ensuring that Hadoop distributions will remain compatible with the open source versions of Hadoop and other Apache technologies, even as vendors make code changes and develop proprietary add-ons. That could help prevent vendor and version lock-in, in which an organization becomes bound to a particular distribution of Hadoop.

However, there’s a lack of unanimity among Hadoop vendors on how best to enable interoperability. Several have formed a group called the Open Data Platform Initiative, set up within the Linux Foundation open source consortium, to develop a common set of interoperability standards for Hadoop. But other vendors have declined to join the group, saying that compatibility and interoperability issues are already being sufficiently addressed within Apache. Assuring alignment with the open source distribution as a standard is certainly desirable in that it allows Hadoop users to maintain some flexibility in their choice of vendors.

Prior to engaging vendors, it’s also important to assess what types of applications your company plans to develop and run using the Hadoop ecosystem, and the required capabilities. Then determine which of these are provided by the community open source versions of Hadoop and other technologies and which require additional functions only provided by a specific Hadoop software vendor.

Weighing all of these factors will help prepare your organization to move forward and evaluate the available options. In our next article, we will assess the similarities and differences between the leading Hadoop distributions.


SearchBusinessAnalytics: BI, CPM and analytics news, tips and resources

Evaluating the different types of DBMS products

The database management system (DBMS) is the heart of today’s operational and analytical business systems. Data is the lifeblood of the organization and the DBMS is the conduit by which data is stored, managed, secured and served to applications and users. But there are many different forms and types of DBMS products on the market, and each offers its own strengths and weaknesses. 

Relational databases, or RDBMSes, became the norm in IT more than 30 years ago as low-cost servers became powerful enough to make them widely practical and relatively affordable. But some shortcomings became more apparent in the Web era and with the full computerization of business and much of daily life. Today, IT departments trying to process unstructured data or data sets with a highly variable structure may also want to consider NoSQL technologies. Applications that require high-speed transactions and rapid response rates, or that perform complex analytics on data in real time or near real time, can benefit from in-memory databases. And some IT departments will want to consider combining multiple database technologies for some processing needs.

The DBMS is central to modern applications, and choosing the proper database technology can affect the success or failure of your IT projects and systems. Today’s database landscape can be complex and confusing, so it is important to understand the types and categories of DBMSes, along with when and why to use them. Let this document serve as your roadmap.

DBMS categories and models

Until relatively recently, the RDBMS was the only category of DBMS worth considering. But the big data trend has brought new types of worthy DBMS products that compete well with relational software for certain use cases. Additionally, an onslaught of new technologies and capabilities is being added to DBMS products of all types, further complicating the database landscape.

The RDBMS: The undisputed leader in terms of revenue and installed base continues to be the RDBMS. Based on the sound mathematics of set theory, relational databases provide data storage, access and protection with reasonable performance for most applications, whether operational or analytical in nature. For more than three decades, the primary operational DBMS has been relational, led by industry giants such as Oracle, Microsoft (SQL Server) and IBM (DB2). The RDBMS is adaptable to most use cases and reliable; it has also been bolstered by years of use in industry applications at Fortune 500 (and smaller) companies. Of course, such stability comes at a cost: RDBMS products are not cheap.

Support for ensuring transactional atomicity, consistency, isolation and durability — collectively known as the ACID properties — is a compelling feature of the RDBMS. ACID compliance guarantees that all transactions are completed correctly or that a database is returned to its previous state if a transaction fails to go through.
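To make the guarantee concrete, here is a minimal C# sketch using ADO.NET’s SqlTransaction (the Accounts table, row IDs and connection string are hypothetical): either both updates commit as a unit, or the rollback returns the database to its previous state.

using System.Data.SqlClient;

// Transfer funds atomically: both UPDATEs succeed, or neither does.
using (var conn = new SqlConnection("Server=.;Database=Bank;Integrated Security=true"))
{
    conn.Open();
    using (var tx = conn.BeginTransaction())
    {
        try
        {
            new SqlCommand("UPDATE Accounts SET Balance = Balance - 100 WHERE Id = 1", conn, tx).ExecuteNonQuery();
            new SqlCommand("UPDATE Accounts SET Balance = Balance + 100 WHERE Id = 2", conn, tx).ExecuteNonQuery();
            tx.Commit();   // atomicity: the transfer becomes visible as a single unit
        }
        catch
        {
            tx.Rollback(); // failure: the database returns to its prior consistent state
            throw;
        }
    }
}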

Given the robust nature of the RDBMS, why are other types of database systems gaining popularity? Web-scale data processing and big data requirements challenge the capabilities of the RDBMS. Although RDBMSes can be used in these realms, DBMS offerings with more flexible schemas, less rigid consistency models and reduced processing overhead can be advantageous in a rapidly changing and dynamic environment. Enter the NoSQL DBMS.

The NoSQL DBMS: Where the RDBMS requires a rigidly defined schema, a NoSQL database permits a flexible schema, in which every data element need not exist for every entity. For loosely defined data structures that may also evolve over time, a NoSQL DBMS can be a more practical solution.

Another difference between NoSQL and relational DBMSes is how data consistency is provided. The RDBMS can ensure the data it stores is always consistent. Most NoSQL DBMS products offer a more relaxed, eventually consistent approach (though some provide varying consistency models that can enable full ACID support). To be fair, most RDBMS products also offer varying levels of locking, consistency and isolation that can be used to implement eventual consistency, and many NoSQL DBMS products are adding options to support full ACID compliance.

So NoSQL addresses some of the problems encountered by RDBMS technologies, making it simpler to work with large amounts of sparse data. Data is considered to be sparse when not every element is populated and there is a lot of “empty space” between actual values. For example, think of a matrix with many zeroes and only a few actual values.

But while certain types of data and use cases can benefit from the NoSQL approach, using NoSQL databases can come at the price of giving up transactional integrity, flexible indexing and ease of querying. Further complicating the issue is that NoSQL is not a specific type of DBMS, but a broad descriptor covering four primary categories of DBMS offerings:

  • Key-value
  • Document
  • Column store
  • Graph

Each of these types of NoSQL DBMS uses a different data model with different strengths, weaknesses and use cases to consider. A thorough evaluation of NoSQL DBMS technology requires more in-depth knowledge of each NoSQL category, along with the data and application needs that must be supported by the DBMS. 

The in-memory DBMS: A third major category of DBMS to consider is the in-memory DBMS (IMDBMS), sometimes referred to as a main memory DBMS. An IMDBMS relies mostly on memory to store data, as opposed to disk-based storage.

The primary use case for the IMDBMS is to improve performance. Because the data is maintained in memory, as opposed to on a disk storage device, I/O latency is greatly reduced. Mechanical disk movement, seek time and transfer to a buffer can be eliminated because the data is immediately accessible in memory.

An IMDBMS can also be optimized to access data in memory, as opposed to a traditional DBMS that is optimized to access data from disk. IMDBMS products can reduce overhead because the internal algorithms usually are simpler, with fewer CPU instructions.

A growing category of DBMS is the multi-model DBMS, which supports more than one type of storage engine. Many NoSQL offerings support more than one data model — for example, document and key-value. RDBMS products are evolving to support NoSQL capabilities, such as adding a column store engine to their relational core.

Other DBMS categories exist, but are not as prevalent as relational, NoSQL and in-memory:

  • XML DBMSes are architected to support XML data, similar to NoSQL document stores. However, most RDBMS products today provide XML support.
  • A columnar database is a SQL database system optimized for reading a few columns of many rows at once (and is not optimized for writing data).
  • Popular in the 1990s, object-oriented (OO) DBMSes were designed to work with OO programming languages, similar to NoSQL document stores.
  • Pre-relational DBMSes include hierarchical systems — such as IBM IMS — and network systems — such as CA IDMS — running on large mainframes. Both still exist and support legacy applications.

Additional considerations      

As you examine the DBMS landscape, you will inevitably encounter many additional issues that require consideration. At the top of that list is platform support. The predominant computing environments today are Linux, Unix, Windows and the mainframe. Not every DBMS is supported on each of these platforms.

Another consideration is vendor support. Many DBMS offerings are open source, particularly in the NoSQL world. The open source approach increases flexibility and reduces initial cost of ownership. However, open source software lacks support unless you purchase a commercial distribution. Total cost of ownership can also be higher when you factor in the related administration, support and ongoing costs.

You might also choose to reduce the pain involved in acquisition and support by using a database appliance or deploying in the cloud. A database appliance is a preinstalled DBMS sold on hardware that is configured and optimized for database applications. Using an appliance can dramatically reduce the cost of implementation and support because the software and hardware are designed to work together.

Implementing your databases in the cloud goes one step further. Instead of implementing a DBMS at your shop, you can contract with a cloud database service provider to implement your databases using the provider’s service.

The next step

If your site is considering a DBMS, it’s important to determine your specific needs as well as examine the leading DBMS products in each category discussed here. Doing so will require additional details on each of the different types of DBMS, as well as a better understanding of the specific use cases for which each database technology is optimized. Indeed, there are many variables that need to be evaluated to ensure you make a wise decision when procuring database management system software.

About the author:
Craig S. Mullins is a data management strategist, researcher, consultant and author with more than 30 years of experience in all facets of database systems development. He is president and principal consultant of Mullins Consulting Inc. and publisher/editor of TheDatabaseSite.com. Email him at craig@craigmullins.com.

Email us at editor@searchdatamanagement.com and follow us on Twitter: @sDataManagement.


SearchBusinessAnalytics: BI, CPM and analytics news, tips and resources