Tag Archives: graph

SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

The series so far:

  1. SQL Server Graph Databases – Part 1: Introduction
  2. SQL Server Graph Databases – Part 2: Querying Data in a Graph Database
  3. SQL Server Graph Databases – Part 3: Modifying Data in a Graph Database
  4. SQL Server Graph Databases – Part 4: Working with hierarchical data in a graph database
  5. SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

With the release of SQL Server 2017, Microsoft introduced graph database features to support data sets that contain complex relationships between entities. The graph capabilities are integrated into the database engine and require no special configurations or installations. You can use these features in addition to or independently of the traditional relational structures. For example, you might implement a graph database for a new master data management solution that could benefit from both graph and relational tables.

When creating a graph database, you might be working with new data, existing data, or a combination of both. In some cases, the data might already exist in relational tables, which do not support the graph features. Only node and edge tables in a graph database allow you to use the new capabilities, in which case, you must either copy the data over to the graph tables or forget about using the graph features altogether.

For those interested in the first option, this article demonstrates how to move from a relational structure to a graph structure, using data from the AdventureWorks2017 sample database. The database might not represent the type of data you had in mind, but it provides a handy way to illustrate how to migrate to a graph structure, using a relational schema already familiar to many of you. Such a recognizable structure also helps demonstrate various ways to query the data once it’s in the graph tables.

Moving from Relational Tables to Graph Tables

The AdventureWorks2017 database includes transactional data related to the fictitious company Adventure Works, which sells bicycles and related equipment to retail outlets and online customers. For this article, we’ll focus on the retail outlets that ordered the products, the sales reps who sold the products, and the vendors who supplied the products, along with such details as the number of items ordered and the amount paid for those items.

To retrieve this type of data from the AdventureWorks2017 database as it exists in its current state, you would be accessing different combinations of the tables shown in the following figure.

word image 236 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

For those who’ve been around SQL Server documentation for a while, such tables as SalesOrderHeader, SalesOrderDetail, Product, and Person should be quite familiar because they’re included in countless examples that demonstrate various ways to work with relational data. However, suppose that you now want to pull some of this information into a graph database, in which case, the data model might look more like the one shown in the next figure.

word image 237 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

The data model includes only four nodes (Stores, SalesReps, Vendors, and Products) and only three edges (Purchases, Sells, and Supplies). Together these nodes and edges define the following relationships:

  • Stores purchase products
  • Sales reps sell products
  • Vendors supply products

You’ll define these relationships within the edge tables by mapping the originating node to the terminating node for each relationship, as you saw in the first article in this series. The implication here is that you should first populate the node tables before the edge tables so you can reference the originating and terminating node IDs when defining your relationships.

Creating and Populating the Node Tables

Before you can create and populate your node tables, you must determine where to put the tables. For the examples in this article, I created the graph schema within the AdventureWorks2017 database, using the following T-SQL code:

You do not need to locate the graph tables in the graph schema or even in the AdventureWorks2017 database. However, if you plan to try out the examples to follow and want to locate the graph tables elsewhere, be sure to update the T-SQL code accordingly.

With the graph schema in place (or wherever you locate the tables), you can then create and populate the Stores node table, which includes two user-defined columns, StoreID and StoreName, as shown in the following CREATETABLE statement:

The example follows the same procedures used in the first article to create and populate node tables, so be sure to refer back to the article if you’re unsure about what we’re doing here. Keep in mind that you must include the ASNODE clause in the CREATETABLE statement. You can also add whatever other user-defined columns you want to include. SQL Server will automatically generate the table’s $ node_id column.

You can then use an INSERT…SELECT statement to populate the Stores table, as you would with any SQL Server table. In this case, you must join the Sales.Customer table to the Sales.Store table to get the store name. In addition, when supplying values for the StoreID column in the Stores table, you should use the CustomerID value in the Customer table, rather than use the StoreID value in that table, because the SalesOrderHeader table uses the CustomerID value. This approach helps to keep things simpler when populating the edge tables. SQL Server automatically populates the $ node_id column.

That’s all there is to setting up the Stores table, and creating and populating the SalesReps table is even easier:

For this example, you can pull all the data directly from the Person table, limiting the results to those rows with a PersonType value of SP (for salesperson). If you want to include such information as sales quotas or job titles in the table, you must join the Person table to the SalesPerson or Employee table (or both). For this example, however, the Person table is enough.

The next table to create and populate is Products. For this, you can pull all the data from the Production.Product table:

For this example, when retrieving data from the Product table, you should include a WHERE clause that filters the data so that only rows with a FinishedGoodsFlag value of 1 are included. This ensures that you include only salable products in the Products table.

The final node table is Vendors, which gets all its data from the Purchasing.Vendor table:

That’s all there is to creating and populating the node tables. Once they’re in place, you can start in on your edge tables.

Creating and Populating the Edge Tables

Creating an edge table is just as simple as a node table, with a few notable differences. For the edge table, the table definition requires an ASEDGE clause, rather than an ASNODE clause, and the user-defined columns are optional. (Node tables require at least one user-defined column.) In addition, SQL Server automatically generates the $ edge_id column, rather than the $ node_id column.

The first edge table is Orders, which includes three user-defined columns, as shown in the following CREATETABLE statement:

After you create the Orders table, you can add the data, which relies on the SalesOrderHeader and SalesOrderDetail tables to supply the values for the user-defined columns and, more importantly, to provide the structure for defining the relationships between the Stores and Products nodes:

After joining the SalesOrderHeader and SalesOrderDetail tables, the SELECT statement joins the SalesOrderHeader table to the Stores tables, based on the CustomerID and StoreID values. The join uses a subquery to retrieve only the $ node_id and StoreID columns from the Stores table and to rename the $ node_id column to node1. The query will fail if you try to use $ node_id in the SELECT list. You can then join the SalesOrderHeader table to the Products table, using the same logic as when joining to the Stores table.

The node1 and node2 columns returned by the SELECT statement provide the values for the $ from_id and $ to_id columns in the edge table. As you’ll recall from the first article, you must specifically provide these values when inserting data into an edge table. The values are essential to defining the relationships between the originating and terminating nodes. SQL Server automatically populates the $ edge_id column.

The next step is to create and populate the Sells edge table, which works much the same way as the Orders table, even when it comes to the user-defined columns. The main difference is that the relationships originate with the SalesReps table, as shown in the following T-SQL code:

The fact that the Orders and Sells tables include the same user-defined columns points to the possibility that you could create a fifth node table for sales orders and then include columns such as OrderDate in there. However, this approach could make your schema and queries unnecessarily complicated, while providing little benefit. On the other hand, this approach helps to eliminate duplicate data. As with any database, the exact layout of your graph model will depend on the type of data you’re storing and how you plan to query that data.

The last step is to create and populate the Supplies table. In this case, the structure for the relationships is available through the ProductVendor table:

The ProductVendor table does all the product-vendor mapping for you and includes the StandardPrice values. You need only join this table to the Vendors and Products tables to get the originating and terminating node IDs.

Retrieving Store Sales Data

With the graph tables now defined and populated, you’re ready to start querying them, just like you saw in the second article in this series. For example, you can use the following SELECT statement to return information about the products that each store has ordered:

The SELECT statement uses the MATCH function to specify what data to retrieve. As described in the second article, the function lets you define a search pattern based on the relationships between nodes. You can use the function only in the WHERE clause of a query that targets node and edge tables. The following table shows part of the results that the SELECT statement returns. (The statement returns over 60,000 rows.)

word image 238 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

In the above example, the MATCH clause specifies the relationship store orders product. If you were to retrieve the same data directly from the relational tables, you could not use the MATCH clause. Instead, your query would look similar to the following:

Although this query is more complex than the previous one, you can use it without having to create and populate graph tables. As with any data, you’ll have to determine on a case-by-case basis when a graph database will be useful to your circumstances and which structure will deliver the best-performing queries.

Returning now to the graph tables, you can modify the preceding example by grouping the data based on the stores and products, as shown in the following example:

As you can see, you can use the MATCH function in conjunction with other clauses, including the HAVING clause, which in this case, limits the results to rows with a total quantity greater than 100. The following figure shows the data now returned by the SELECT statement.

word image 239 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

When implementing a graph database based on existing relational data, you might want to copy only part of the data set into the graph tables, in which case, you’ll likely need to create queries that can retrieve data from both the graph and relational tables. One way to achieve this is to define a common table expression (CTE) that retrieves the graph data and then use the CTE when retrieving the relational data, as shown in the following example:

In this case, the outer SELECT statement joins the data from the CTE to the Product, ProductSubcategory, and ProductCategory tables in order to include the product categories and subcategories in the results, as shown in the following figure.

word image 240 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

Being able to access both graph and relational data makes it possible to implement a graph database for those complex relationships that can justify the additional work, while still retaining the basic relational structure for all other data.

Retrieving Sales Rep and Vendor Data

Of course, once you have your graph tables in place, you can run a query against any of them. For example, the following query returns a list of sales reps and the products they have sold, along with details about the orders:

As you can see, retrieving information about the Sells relationships works just like returning data about the Orders relationships, but now the results are specific to each sales rep, as shown in the following figure.

word image 241 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

The results shown here are only a small portion of the returned data. The statement actually returns over 60,000 rows. However, you can aggregate the data just as you saw earlier:

Now the SELECT statement returns only 58 rows, with the first 10 shown below.

word image 242 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

There’s little difference between returning data based on the Orders relationships or the Sells relationships, except that the originating nodes are different. You can also take the same approach to retrieve vendor data. Just be sure to update the table alias references as necessary, as shown in the following example:

This should all look familiar to you. The SELECT statement uses a CTE to join the graph and relational data together. The following table shows the first 10 rows of the 32 that the statement returns.

word image 243 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

As you can see, the results include the vendor and product names, along with the product subcategories and categories.

Digging into the Graph Data

Once you get the basics down of how to query your graph tables, you can come up with other ways to understand the relationships between the nodes. For example, the following SELECT statement attempts to identity sales reps who might be focusing too heavily on certain vendors:

The statement groups the data by the name of the sales reps and then by the vendors. The statement also includes a HAVING clause that calculates an amount 50 times the average sales and then compares that to the total sales of each sales rep. Only reps that go over the calculated amount are included in the results, as shown in the following figure.

word image 244 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

By being able to return this type of information, you can identify patterns that point to anomalies or specific trends in the data set. For instance, suppose you now want to identify the products that stores have bought based on a specified product that they also bought (a scenario sometimes referred to customers who bought this also bought that).

One way to get this information is to use a CTE to retrieve the IDs of the stores that ordered the specified product and then, for each store return the list of other products that the store ordered. To achieve this, use the CTE to qualify your query so it returns only the other products that the stores bought:

The outer SELECT statement returns the list of products that each of the three stores has ordered. The key is to use the IN operator in a WHERE clause condition to compare the StoreId value to a list of store IDs returned by the CTE. You should also include a WHERE clause condition to exclude the product Sport-100 Helmet, Blue. The SELECT statement returns the results shown in the following figure.

word image 245 SQL Server Graph Databases – Part 5: Importing Relational Data into a Graph Database

There are other ways you can get at customers who bought this also bought that information, such as using Python or R, but this approach provides a relatively simple way to get the data from a graph database, without having to jump through too many hoops.

Making the Best of Both Worlds

Because the graph database features are integrated with the database engine, there’s no reason you can’t work with graph and relational data side-by-side, depending on your application requirements and the nature of your data. This integration also gives you the flexibility to incorporate graph tables into an existing relational structure or make them both part of the design when planning for a new application. Keep in mind, however, that the graph features are still new to SQL Server and lack some of the advanced capabilities available to more established graph products. Perhaps after a couple more releases, SQL Server will be a more a viable contender in the graph database market, at least when used in conjunction with relational data.

Let’s block ads! (Why?)

SQL – Simple Talk

SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

The series so far:

  1. SQL Server Graph Databases – Part 1: Introduction
  2. SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

Microsoft incorporated the graph database in SQL Server 2017, providing a logical structure for storing and querying data sets that contain complex many-to-many or hierarchical relationships. The first article in this series introduced you to the basics of graph databases and described how to define node and edge tables and populate them with data. In this article, we turn to the querying side of the equation, with a focus on retrieving related data in multiple node tables.

For the examples in this article, I used the FishGraph database from the first article in this series, except that I added more sample data. The database is based on a fictitious fish-lovers forum and includes three node tables (FishSpecies, FishLover, and FishPost) and three edge tables (Likes, Posts, and LinksTo). The following figure shows the data model used to build the database.

word image 83 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

The rectangles represent the nodes, and the arrows connecting the nodes represent the edges, with the arrows pointing in the direction of the relationships. You can download the T-SQL script used to create and populate the database at the bottom of this article.

Introducing the MATCH Function

For the most part, querying tables in a graph database works much the same way as querying regular relational tables, although there are some limitations, such as not being able to perform cross-database queries on graph tables. (For more information about graph database limitations, refer to the Microsoft document SQL Graph Architecture.)

Despite the limitations, you should find that most queries work as expected. For example, the following SELECT statement joins the FishLover, Likes, and FishSpecies tables in order to retrieve a list of users who like certain fish species:

The SELECT statement uses the $ node_id, $ from_id, and $ to_id column aliases to join the tables and return the data shown in the following figure

word image 84 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

As you can see, querying graph tables is a fairly straightforward process, especially when you take advantage of the graph column aliases. Of course, if you’re trying to retrieve data based on more complex relationships, the query itself also becomes more complex and can even get a bit unwieldy. For this reason, Microsoft has added the MATCH function for retrieving related data from graph tables.

The MATCH function lets you define a search pattern based on the relationships between nodes. You can use the function only in the WHERE clause of a SELECT statement that queries node and edge tables. The following syntax shows the elements that go into defining the search pattern:

A search pattern can define one or more relationships. For each relationship, you must identify the originating and terminating nodes, as well as the edge that ties the two nodes together. You must also specify the direction of the relationship, using dashes and arrows, with the edge situated between the two nodes. For example, if you want to define a single relationship that originates with node1 and terminates with node2, you would use the following syntax for your WHERE clause:

Notice that the edge is enclosed in parentheses, with a dash preceding the edge, and a dash and right arrow following the edge. This defines a relationship that moves from left-to-right. You can reverse this order by specifying node1 on the right side of the search pattern and node2 on the left side, with the arrow pointing in the opposite direction:

In either case, the search pattern indicates that the WHERE clause should return only those rows in which a relationship exists between node1 (the originating node) and node2 (the terminating node), as defined in the edge element.

With these basics in mind, you can rewrite the SELECT statement above to simplify the query:

In the FROM clause, you simply list the participating tables – without the ON clause– providing table aliases where appropriate. You can then reference the aliases in the search pattern of the MATCH function. In this case, the search pattern defines the relationship fish lover likes a fish species. The statement will return the same results as those returned by the SELECT statement above.

You can also reverse the order of the relationship so that it is defined from right-to-left:

Again, the SELECT statement returns the same results as the previous two SELECT statements.

You can include other elements in the WHERE clause, in addition to the MATCH function. For example, the following WHERE clause adds a search condition specifying that only rows with a Username value of hooked should be returned:

The statement now returns the results shown in the following figure.

word image 85 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

As you can see, using the MATCH function to define a search pattern based on a single relationship is a fairly straightforward process. In many cases, however, you’ll want to return data based on multiple relationships, which is where the function can be particularly handy.

Creating Compound MATCH Expressions

If you refer back to the syntax for the MATCH function, you’ll notice that you can use the AND operator when defining your search pattern, allowing you to string together multiple relationships within a single expression. Note, however, that the MATCH function does not support the OR operator or the NOT operator, so the logic you can define is somewhat limited. Even so, the AND operator can still be very useful. For example, the following search pattern uses the operator to string together two relationships:

For the SELECT statement to return a row, a fish lover must like a fish post and the fish post must link to a fish species. In this case, the statement returns only two rows, as shown in the following figure.

word image 86 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

By being able to link together multiple relationships, you can dig more deeply into how the nodes in a graph database are interconnected. For example, the preceding SELECT statement might help to determine whether users are more inclined to like a post that specifically links to a fish species.

In some cases, you can link together relationships without using the AND operator, as long as your search pattern defines the same logic. For instance, you can rewrite the preceding SELECT statement by eliminating the AND operator and one of the references to the Post table alias:

The SELECT statement returns the same results as the preceding one, even though the search pattern has been simplified. You can also define a search pattern that contains two relationships terminating with the same node, as in the following example:

The first relationship (fish lover likes a fish species) is defined from left-to right, and the second relationship (fish post links to a fish species) is defined from right-to-left. As a result, the SELECT statement returns only those rows in which a fish species is both liked and linked to, giving us the results shown in the following figure.

word image 87 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

You can also define search patterns that include more than two relationships, using the AND operator where appropriate. For example, the search pattern in the following SELECT statement includes three relationships but only one instance of the AND operator:

Because the Likes table is referenced twice within the search pattern, it must be included twice in the FROM clause, with a different alias assigned to each instance. The search pattern then uses these aliases when defining the three relationships (fish lover likes a fish post, fish post links to a fish species, and fish lover likes a fish species). The SELECT statement will return only those rows in which a fish species is both liked and linked to from a post that is liked, as shown in the following figure.

In this case, the user underwatercasey likes both the fish post and fish species, and the fish post links to the fish species. You can also rewrite the search pattern to eliminate the AND operator altogether:

The SELECT statement returns the same results as the preceding statement but simplifies the search pattern. In some cases, however, you might find that the AND operator makes it easier to read and troubleshoot your code as you heap on more and more relationships.

Defining a Self-referencing Query

Because of the way in which a graph database is structured, it is just as easy to perform a self-referencing query as any other type of query. For example, the following SELECT statement returns a list of fish posts that link to other posts:

To define the relationship in the search pattern, you must include two instances of the FishPost table in the FROM clause, assigning a different alias to each instance, similar to how you included multiple instances of the Likes edge table in the preceding two examples. The SELECT statement returns the results shown in the following figure.

word image 89 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

You can use the same logic to determine which fish lovers like other fish lovers:

This time the FROM clause includes two instances of the FishLover table, with a unique alias assigned to each one, giving us the results shown in the following figure.

word image 90 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

You can also use this approach when defining more than two relationships in your search pattern. For example, the following SELECT statement returns a list of users that are liked by user hooked, along with the users that they like:

The FROM clause now includes three instances of the FishLover table and two instances of the Likes table. The search pattern uses these instances to define the relationships fish lover1 likes fish lover2 and fish lover2 likes fish lover3, giving us the results shown in the following figure.

word image 91 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

Essentially what we have here is a friend-of-a-friend type scenario:

  • The user hooked likes the users powerangler and deepdive.
  • The user powerangler likes the user deepdive.
  • The user deepdive likes the user underwatercasey.

Self-referencing queries make it easy to discover how data is related wherever it resides. Because the relationships are stored within the edge tables, it does not matter if the relationships cross multiple nodes, are contained within a single node, or are a combination of both.

Combining the MATCH Function with Other T-SQL Features

So far in this article, the SELECT statements we’ve covered have included only the SELECT, FROM, and WHERE clauses. However, you can use the MATCH function with other query types, such as queries that group and aggregate data. For example, the following SELECT statement includes the GROUPBY and ORDERBY clauses:

The GROUPBY clause groups the data by the Username values in the FishLover table in order to provide a count of total likes per user, as shown in the following results.

word image 92 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

Although this is a fairly basic example, it demonstrates how you can build various types of queries based on data from a graph database, just like queries against relational tables. You can even use a graph database in conjunction with Microsoft Machine Learning Services (MLS). For example, the following T-SQL script retrieves data from a graph database and then uses it in a Python script to generate a bar chart:

I won’t go into too much detail about running the Python script because I covered it in my series on Python and MLS. (The series starts with the article SQL Server Machine Learning Services – Part 1: Python Basics.) Basically, the T-SQL script does the following:

  • Declares the @pscript variable and assigns the Python script to the variable. The script creates a data frame based on the data returned from the graph database and then generates a bar chart based on the data.
  • Declares the @sqlscript variable and assigns the SELECT statement to the variable. This is the same SELECT statement shown in the previous example.
  • Passes in the two variables as parameter values to the sp_execute_external_script stored procedure, which retrieves the graph data and runs the Python code.

When you run the sp_execute_external_script stored procedure, the Python engine generates the bar chart shown in the following figure.

word image 93 SQL Server Graph Databases – Part 2: Querying Data in a Graph Database

As these examples demonstrate, graph databases have been well integrated into the SQL Server ecosystem, providing you with a great deal of flexibility for working with graph data. MLS in particular opens up a wide range of possibilities for analyzing and visualizing the data in a variety of ways.

Conclusion

As with storing and populating graph tables, querying those tables is a relatively painless process, once you understand the fundamentals. Where applicable, you can use the MATCH function to streamline your queries. When that’s not practical, you can revert to regular queries to get the data you need. The only trick is to make sure you’re working against an instance of SQL Server 2017 and not an earlier version. Fortunately, you can download the Developer’s edition for free, which comes with all the features available to the Enterprise edition, including support for graph databases. You can then use what you learned so far in this series to start exploring graph databases on your own.

Let’s block ads! (Why?)

SQL – Simple Talk

How to Map Relational Data to a Graph Database

iStock 817442226 e1522345309668 How to Map Relational Data to a Graph Database

Organizations, especially a new class of employees like business analysts, data scientists, and marketing strategists, are actually trying to get better insight using the relations in all the data records of their enterprise.  In order to better serve these new ways of searching for information, it is becoming critical to find a way to expose existing enterprise data in a graph database format. Unfortunately, most of today’s data is stored in relational database management systems (RDBMS).

In the relational data storage model, there is no physical link to indicate the relation among different data records. They are only linked logically by a special attribute called a foreign key, rendering the search of related records not as effective. In order to find relationships in a relational data storage model, you have to build complex queries that perform actions over and over again. It’s not very efficient. While relational databases are widely used, they are not ideal for storing and querying data that have a high degree of relationships.

On the other hand, the graph data storage model keeps data attributes together naturally (in nodes). By creating physical links called edges, all related nodes are kept together, highlighting the relationships between data sets. This focus on relationships in graph database makes it a far better home for today’s interconnected data than traditional relational databases.

But, since most of today’s data is still in relational database format, how do you convert your current data to a graph database format? Mapping data from a relational database to graph database is fundamentally a task of converting the relational representation from one database to the other. More specifically, we can use the foreign keys of relational data model to build edges, thus transforming loosely coupled data records into a highly bounded group of nodes. Nodes form the basis of graph database architecture.

For a detailed walk-through of how to do this, download the brief: “How to Map Relational Data to a Graph Database”. In it, we go step by step and use a well-known example to demonstrate an approach to map information from a relational database to a graph database.

Let’s block ads! (Why?)

The TIBCO Blog

Graph Databases: The Next Generation of Fraud Detection Technology

graph database Graph Databases: The Next Generation of Fraud Detection Technology

This is a follow-up blog post in our Graph Database series. Learn what a graph database is and why it’s important here.

Today’s approach to fraud detection: Discrete Analysis

With its terrific ability to enable users to spot patterns quickly and easily, graph database has wide implications for use in a number of industries including fraud detection. Fraud accounts for billions of dollars lost per year and it seems each year, fraudsters get more sophisticated in outsmarting the banks. Unfortunately, the most common attribute of fraud is misdirection and burying their patterns in lots of data.  With traditional data storage techniques, it really isn’t possible to see beyond individual points to the connections between them. That sort of heightened, overarching pattern-like view has now become more feasible with graph database. The connected data view in graph database can more easily uncover these larger complex patterns and make fraud evasion harder. To understand how useful an addition graph database can be in detecting fraud, it’s best to first understand Gartner’s layered approach to fraud detection.

Gartner analysts say “No single layer of fraud prevention or authentication is enough to keep determined fraudsters out of enterprise systems.” They say that you need a combination of monitoring and controls to combat fraud. Therefore, they recommend following the 5 layered fraud prevention approach. This approach recommends that companies use all five layers, ranging from the most basic of security measures (secure browsing) to the most complex (analysis of relationships), and encourages companies to face the fact that some determined fraudsters will break through the first few layers of security. Because some will make it in.

Most of today’s available fraud prevention solutions only address the first four layers of Gartner’s recommended approach. For instance, banks mainly use transaction monitoring systems (TMS) which rely on relational databases. Due to their linear, discrete analysis approach, most of today’s solutions can usually only spot trends and incidents after they’ve happened and only a limited basis.

While the discrete approach is an easy one that helps users spot patterns and capture fraudsters acting alone, it doesn’t necessarily detect patterns between all the different data endpoints and therefore, is not very useful in detecting elaborate crime rings. Further, modern fraud rings have become very familiar with the ways of discrete data analysis and know how to avoid detection with this approach. This leads us to the next frontier of fraud detection which revolves around connected analysis.

Tomorrow’s approach to fraud detection: Connected Analysis

This is where graph database can really add value. Graph database addresses Gartner’s fifth layer of fraud prevention: entity link analysis. Graph database enables banks to look beyond the individual data points of discrete analysis to the connections that link them. With graph database, banks can see their data in “graphs” and more easily visualize patterns and opportunities to better predict when and where fraud might occur.

Another important trait that makes graph database a value add to any fraud prevention solution is its inherent speed in calculating relationships. Since the relationships in graph database are treated with as much value as the database records themselves, the engine that navigates the connections between nodes can do so efficiently, enabling millions of connections per second. Graph database enables quick extraction of new insight from large and complex databases to help uncover unknown interactions and relationships. This means that with a graph database, banks can process data and compute risks quicker than today’s current relational databases so they can spot opportunities and threats before the competition.

According to Forbes, graph database reduces false positives, improve false negative detection, ease investigations, and reduce overall fraud investigation costs. Fundamentally, fraud detection depends on the ability to analyze the relationships between customers and transactions, and recognize patterns or trends. All at the speed of today’s transactions. Graph database provides the speed and the ability to detect large patterns making it the ideal addition to any fraud prevention solution.

Let’s block ads! (Why?)

The TIBCO Blog

Is there a Bond Graph toolbox or Package?

 Is there a Bond Graph toolbox or Package?

It seems like there was one, but it doesn’t exist anymore

Let’s block ads! (Why?)

Recent Questions – Mathematica Stack Exchange

Using R to illustrate relationships from graph objects

My new article about Graph Database Objects was just published, it’s really a very good new feature.

Another new feature we have since SQL Server 2016 is R language. We can use R language inside the database to analyze our data.

The possibility to use both technologies together is very interesting. Using graph objects we can store relationships between elements, for example, relationships between forum members. Using R scripts we can build a cluster graph from the stored graph information, illustrating the relationships in the graph.

The script below creates a database for our example with a subset of the objects used in my article and a few more relationship records between the forum members.

use master
drop database if exists RGraph
go
create database RGraph
go

use RGraph
go

drop table if exists dbo.Likes
drop table if exists dbo.ForumMembers

CREATE TABLE [dbo].[ForumMembers](
       [MemberID] [int] IDENTITY(1,1) NOT NULL,
       [MemberName] [varchar](100) NULL
)
AS NODE
GO

CREATE TABLE [dbo].[Likes]
AS EDGE

INSERT ForumMembers values (‘Mike’),(‘Carl’),(‘Paul’),(‘Christy’),(‘Jennifer’),(‘Charlie’)

INSERT Likes ($ to_id,$ from_id)
    VALUES
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 1),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 2)),
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 3),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 2)),
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 1),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 4)),
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 5),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 4))

insert into likes ($ from_id,$ to_id) values
  ((select $ node_id from dbo.forummembers where MemberName=‘Mike’),
    (select $ node_id from dbo.forummembers where MemberName=‘Paul’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Paul’),
    (select $ node_id from dbo.forummembers where MemberName=‘Christy’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Christy’),
    (select $ node_id from dbo.forummembers where MemberName=‘Carl’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Paul’),
    (select $ node_id from dbo.forummembers where MemberName=‘Jennifer’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Jennifer’),
    (select $ node_id from dbo.forummembers where MemberName=‘Carl’))

You need to follow a few steps to enable R scripts:

1) Install R in SQL Server

You can follow the instructions in this link to install R in SQL Server

2) Install the package iGraph in R

You can follow the instructions in this link to install iGraph package

3) Enable the use of external scripts. You can execute the following t-sql code:

exec sp_configure ‘external scripts enabled’, 1;

RECONFIGURE;

4) Execute the script

Below is the R script to build a cluster graph using iGraph our relationships stored in SQL Server. You need to change the folder to store the graph.

exec sp_execute_external_script @language = N’R’,
@script = N’

require(igraph)

g <- graph.data.frame(graphdf)

V(g)$ label.cex <- 2

png(filename = “c:\R\plot1.png”, height = 1200, width = 1200, res = 100); plot(g, vertex.label.family = “sans”, vertex.size = 40)

dev.off() ‘,
@input_data_1 = N’select LikeMember.MemberName as LikeMember,    LikedMember.MemberName as LikedMember         from dbo.ForumMembers as LikeMember,  dbo.ForumMembers as LikedMember,  Likes

        where Match(LikeMember-(Likes)->LikedMember) ‘,
@input_data_1_name = N’graphdf’

GO

The image below shows the resulting graph with the relations between forum members:

RAndGraph1 Using R to illustrate relationships from graph objects

It’s interesting to remember about the new feature in SQL Server 2017: We can also build the same script using Python.

Let’s block ads! (Why?)

SQL – Simple Talk

Using R to illustrate relationships from graph objects

My new article about Graph Database Objects was just published, it’s really a very good new feature.

Another new feature we have since SQL Server 2016 is R language. We can use R language inside the database to analyze our data.

The possibility to use both technologies together is very interesting. Using graph objects we can store relationships between elements, for example, relationships between forum members. Using R scripts we can build a cluster graph from the stored graph information, illustrating the relationships in the graph.

The script below creates a database for our example with a subset of the objects used in my article and a few more relationship records between the forum members.

use master
drop database if exists RGraph
go
create database RGraph
go

use RGraph
go

drop table if exists dbo.Likes
drop table if exists dbo.ForumMembers

CREATE TABLE [dbo].[ForumMembers](
       [MemberID] [int] IDENTITY(1,1) NOT NULL,
       [MemberName] [varchar](100) NULL
)
AS NODE
GO

CREATE TABLE [dbo].[Likes]
AS EDGE

INSERT ForumMembers values (‘Mike’),(‘Carl’),(‘Paul’),(‘Christy’),(‘Jennifer’),(‘Charlie’)

INSERT Likes ($ to_id,$ from_id)
    VALUES
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 1),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 2)),
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 3),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 2)),
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 1),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 4)),
    ((SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 5),
         (SELECT $ node_id FROM dbo.ForumMembers WHERE MemberID = 4))

insert into likes ($ from_id,$ to_id) values
  ((select $ node_id from dbo.forummembers where MemberName=‘Mike’),
    (select $ node_id from dbo.forummembers where MemberName=‘Paul’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Paul’),
    (select $ node_id from dbo.forummembers where MemberName=‘Christy’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Christy’),
    (select $ node_id from dbo.forummembers where MemberName=‘Carl’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Paul’),
    (select $ node_id from dbo.forummembers where MemberName=‘Jennifer’)),
  ((select $ node_id from dbo.forummembers where MemberName=‘Jennifer’),
    (select $ node_id from dbo.forummembers where MemberName=‘Carl’))

You need to follow a few steps to enable R scripts:

1) Install R in SQL Server

You can follow the instructions in this link to install R in SQL Server

2) Install the package iGraph in R

You can follow the instructions in this link to install iGraph package

3) Enable the use of external scripts. You can execute the following t-sql code:

exec sp_configure ‘external scripts enabled’, 1;

RECONFIGURE;

4) Execute the script

Below is the R script to build a cluster graph using iGraph our relationships stored in SQL Server. You need to change the folder to store the graph.

exec sp_execute_external_script @language = N’R’,
@script = N’

require(igraph)

g <- graph.data.frame(graphdf)

V(g)$ label.cex <- 2

png(filename = “c:\R\plot1.png”, height = 1200, width = 1200, res = 100); plot(g, vertex.label.family = “sans”, vertex.size = 40)

dev.off() ‘,
@input_data_1 = N’select LikeMember.MemberName as LikeMember,    LikedMember.MemberName as LikedMember         from dbo.ForumMembers as LikeMember,  dbo.ForumMembers as LikedMember,  Likes

        where Match(LikeMember-(Likes)->LikedMember) ‘,
@input_data_1_name = N’graphdf’

GO

The image below shows the resulting graph with the relations between forum members:

RAndGraph1 Using R to illustrate relationships from graph objects

It’s interesting to remember about the new feature in SQL Server 2017: We can also build the same script using Python.

Let’s block ads! (Why?)

SQL – Simple Talk

Graph reduction

Let us consider graph1:
eblaC Graph reduction

g1 = {4612 <-> 4613, 4613 <-> 4614, 4642 <-> 4612, 4614 <-> 4522, 4798 <-> 4642, 4522 <-> 4376, 4536 <-> 4798, 4798 <-> 4996, 4376 <-> 4201, 4338 <-> 4536, 4813 <-> 4996, 4201 <-> 4043, 4074 <-> 4338, 4813 <-> 4735, 4043 <-> 3813, 3796 <-> 4074, 4646 <-> 4735, 3711 <-> 3813, 3665 <-> 3796, 4646 <-> 4585, 3711 <-> 3450, 3509 <-> 3665, 4584 <-> 4585, 3119 <-> 3450, 3177 <-> 3509, 4662 <-> 4584, 3119 <-> 2911, 2890 <-> 3177,4729 <-> 4662, 2911 <-> 2714, 2642 <-> 2890, 4729 <-> 4753, 2551 <-> 2714, 2641 <-> 2642, 4875 <-> 4753, 2518 <-> 2551, 4972 <-> 4875, 2481 <-> 2518, 5081 <-> 4972, 2365 <-> 2481, 4967 <-> 5081, 2320 <-> 2365, 4938 <-> 4967, 2310 <-> 2320, 4937 <-> 4938, 2215 <-> 2310, 2310 <-> 2317, 4942 <-> 4937, 2053 <-> 2215, 2315 <-> 2317, 4923 <-> 4942, 1943 <-> 2053, 2315 <-> 2316, 4922 <-> 4923, 1942 <-> 1943, 2329 <-> 2316, 4880 <-> 4922, 2329 <-> 2248, 4721 <-> 4880, 2248 <-> 2249, 4673 <-> 4721, 4683 <-> 4721, 2249 <-> 2246, 4672 <-> 4673, 4508 <-> 4683, 2246 <-> 2191, 4831 <-> 4672, 4507 <-> 4508, 2191 <-> 2093, 4779 <-> 4831, 2093 <-> 2052, 4551 <-> 4779, 4717 <-> 4779, 2052 <-> 2000, 4551 <-> 4409, 4489 <-> 4717, 2000 <-> 1961, 4274 <-> 4409, 4323 <-> 4489, 1961 <-> 1950, 4224 <-> 4274, 4084 <-> 4323, 1950 <-> 1951, 4223 <-> 4224, 3876 <-> 4084, 1951 <-> 1957, 4336 <-> 4223, 3769 <-> 3876, 1957 <-> 1948, 4336 <-> 4069, 4232 <-> 4336, 3704 <-> 3769, 1948 <-> 1949, 3767 <-> 4069, 4103 <-> 4232, 3545 <-> 3704, 2054 <-> 1949, 3561 <-> 3767, 4055 <-> 4103, 3409 <-> 3545, 2054 <-> 1996, 3415 <-> 3561, 3899 <-> 4055, 3408 <-> 3409, 1996 <-> 1997, 3415 <-> 3377, 3898 <-> 3899, 3425 <-> 3408, 2043 <-> 1997, 3345 <-> 3377, 3905 <-> 3898, 3461 <-> 3425, 2043 <-> 2128, 3277 <-> 3345, 3689 <-> 3905, 3410 <-> 3461, 2091 <-> 2128, 3277 <-> 3105, 3459 <-> 3689, 3360 <-> 3410, 2091 <-> 1946, 2923 <-> 3105, 3458 <-> 3459, 3254 <-> 3360, 1946 <-> 1838, 2822 <-> 2923, 2923 <-> 2894, 3458 <-> 3460, 3239 <-> 3254, 1725 <-> 1838, 2772 <-> 2822, 2894 <-> 2788, 3407 <-> 3460, 3238 <-> 3239, 1725 <-> 1531, 2771 <-> 2772, 2788 <-> 2598, 3406 <-> 3407, 1531 <-> 1342, 2480 <-> 2598, 3514 <-> 3406, 1342 <-> 1276, 2480 <-> 2402, 3321 <-> 3514, 3514 <-> 3504, 1219 <-> 1276, 2402 <-> 2400, 3153 <-> 3321, 3504 <-> 3272, 1219 <-> 1090, 2400 <-> 2401, 3042 <-> 3153,3023 <-> 3272, 1090 <-> 1035, 2793 <-> 3042, 3084 <-> 3042,2850 <-> 3023, 997 <-> 1035, 2424 <-> 2793, 3008 <-> 3084, 2739 <-> 2850, 997 <-> 960, 2134 <-> 2424, 3007 <-> 3008, 2578 <-> 2739, 2739 <-> 2645, 960 <-> 961, 1914 <-> 2134, 2488 <-> 2578, 2645 <-> 2356, 1656 <-> 1914, 2278 <-> 2488, 2195 <-> 2356, 1655 <-> 1656, 2277 <-> 2278, 2195 <->2023, 1896 <-> 2023, 1895 <-> 1896};

gx = Graph[g1];
we1 = Select[SortBy[{VertexList[g1], DegreeCentrality[g1]}[Transpose], Last], #[2] == 1 &][[All, 1]];

we11 = Table[we1[[i]] -> i, {i, 1, Length[we1]}];
we2 = Select[SortBy[{VertexList[g1], DegreeCentrality[g1]}[Transpose],Last],#[2] == 3 &][[All, 1]];
we22 = Table[we2[[i]] -> i + 11, {i, 1, Length[we2]}];
gx = Graph[g1(,GraphLayout[Rule]”SpringEmbedding”),VertexLabels -> Join[we11, we22]];
graph1 = HighlightGraph[gx, {Style[we2, Red], Style[we1, Yellow]}, VertexSize -> 2]

How to reduce the graph1 to graph2?. Graph2: x = {1 <-> 12, 12 <-> 4, 12 <-> 20, 20 <-> 7, 20 <-> 18, 18 <-> 11, 18 <-> 19, 19 <-> 10, 19 <-> 17, 17 <-> 14, 14 <-> 8, 14 <-> 6, 17 <-> 16, 16 <-> 13, 13 <-> 3, 13 <-> 5, 16 <-> 15, 15 <-> 2, 15 <-> 9};

graph2 = Graph[Reverse[x], GraphLayout -> “SpringEmbedding”,VertexLabels -> “Name”]

e0Ump Graph reduction

I have such an algorithm but it works relatively slowly (I need it to very large networks).
Designation of nodes are accidental and does not matter.

Let’s block ads! (Why?)

Recent Questions – Mathematica Stack Exchange

Both axes labels and frame to my graph

 Both axes labels and frame to my graph

Seems like I can’t ad axes labels and a frame to my graph.

I am trying this:

Plot[Blackvolality[k], {k, 0, 0.1}, AxesLabel -> {Volatility, Strike},
PlotLabel -> "Beta=0"]

This work fine^. But when I add frame to it:

Plot[Blackvolality[k], {k, 0, 0.1}, AxesLabel -> {Volatility, Strike},
PlotLabel -> "Beta=0",Frame->True]

Suddenly my Axes labels disappear. How Come? How can I have both?

Let’s block ads! (Why?)

Recent Questions – Mathematica Stack Exchange

Permutations on graph vertex labels

 Permutations on graph vertex labels

I want to generate all the possible adjacency matrices of equivalent unlabeled graphs. For example, consider the simple path graph of three vertices. There are three possible adjacency matrices:

a1={{0, 1, 0}, {1, 0, 1}, {0, 1, 0}};
a2={{0, 1, 1}, {1, 0, 0}, {1, 0, 0}};
a3={{0, 0, 1}, {0, 0, 1}, {1, 1, 0}};

Each matrix corresponding to a different (numerical) labeling of the vertices.

Is there a way to generate the other representations given any one of them for any simply connected graph?

Let’s block ads! (Why?)

Recent Questions – Mathematica Stack Exchange