Tag Archives: Database

How SQL Server synonyms help database DevOps

March 10, 2021   BI News and Info

Synonyms in SQL Server are one of those useful but often forgotten features. A synonym is a database-level object that provides an alternative name for another database object such as a view, user-defined table, scalar function, stored procedure, inline table-valued function (TVF), or extended stored procedure. They can also be used for CLR assembly-related stored procedures, CLR TVFs, CLR scalar functions, and even CLR aggregate functions. There are many practical uses for synonyms, and I’ll explain how to create them and cover some use cases.

You can create a synonym using the GUI in SSMS or via a script. Here’s a sample script:

USE [AdventureWorks2014]
GO
CREATE SYNONYM [dbo].[EmployeeDemo]
FOR [MRSurfacePro].[AdventureWorks2014].[HumanResources].[Employee]
GO

The example creates an alternate name for the HumanResources.Employee table. Note that this example uses a four-part name that includes the server name. Here’s an example of how to use the new synonym:

SELECT * FROM dbo.[EmployeeDemo]

If you also query the original table, you’ll see that it returns the same rows. You can reference the new name in code, linked servers, applications, and more. In situations where you would otherwise need a three- or four-part name, you can just use the synonym without the multipart naming, which greatly reduces the need for code changes.

There are a few caveats to keep in mind. Synonyms cannot be referenced in things like CHECK constraints, computed columns, default expressions, rule expressions, or schema-bound views and functions. They also cannot be used in DDL (Data Definition Language) statements; to make changes to the underlying schema that the synonym represents, you must reference the actual object name within any DDL statement.

We all know that, when developing objects, the naming used at the beginning of a project can change and improve over time. Using synonyms can be a real project time saver: you can continue to be agile without a huge need to back-port changes. Synonyms can also help with database migrations from one server to another. How many times have you had to migrate to a server with a new name? A synonym simplifies the process. All you would have to do is change the four-part name in the synonym definition; no other changes would be required.
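
For example, a post-deployment step could simply repoint the synonym at the new server. SQL Server has no ALTER SYNONYM statement, so the synonym is dropped and recreated; the server name below is only an assumption for illustration:

DROP SYNONYM [dbo].[EmployeeDemo]
GO
CREATE SYNONYM [dbo].[EmployeeDemo]
FOR [NewServerName].[AdventureWorks2014].[HumanResources].[Employee]
GO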

Imagine how easy this can make DevOps. Say you have a cross-database view and database names are different on the development server or servers in the pipeline. If you use synonyms, no additional code changes are needed when referencing the view, and a post-deployment script could just change the synonym definition depending on the environment.

Synonyms simplify and remove the need for code changes. It’s definitely something to consider. I am always cautious as a DBA to recommend things like this as it tends to make it more difficult to track down issues and troubleshoot back to the original table or source, so be sure to document them. There are times, however, in which we need to accomplish things as I explained above, and synonyms are a great avenue to do so.

SQL – Simple Talk


SQL Server authentication methods, logins, and database users

March 3, 2021   BI News and Info

Setting up and managing SQL Server security is an important part of building and maintaining your SQL Server environment. SQL Server security is a vast topic that cannot be covered in a single article. This article starts with a few foundation topics of SQL Server security: SQL Server Authentication methods, logins and database users.

Authentication methods supported

There are two different authentication methods for connecting to SQL Server: Windows and SQL Server.

Windows authentication requires a user to first authenticate to Windows with their login and password. Once a user has been authenticated to Windows, they can then connect to SQL Server using Windows authentication, provided their Windows account has been granted access to SQL Server via a login (more on logins later). Windows authentication is tightly coupled with Windows Security and is also known as Integrated Security. Windows authentication works great when a person is part of a Windows domain.

But there are times when people can’t connect to Windows; this is where SQL authentication comes in. SQL authentication is less secure than Windows authentication. In order to connect to SQL Server using SQL authentication, a person needs to provide a login and password when they connect. The password for a SQL-authenticated login is stored in the master database. Because the password is stored in a SQL Server database, it is more easily hacked, and it can also be backed up and restored along with a database backup, which is why it is less secure than Windows authentication.

Windows Authentication is the default authentication mode when installing a SQL Server instance. In order to support SQL authentication, you need to configure SQL Server to support mixed mode authentication. When mixed mode is used, both Windows and SQL authentication can be used to connect to SQL Server. If SQL Server is not set up to support mixed mode, then only Windows accounts can connect to SQL Server.

Because SQL authentication is less secure than Windows Authentication, it is recommended that you only set up your SQL Server instance to support mixed mode if you have a requirement to support users or applications that can’t connect to Windows. Even though Windows Authentication is more secure and the recommended practice for connecting to SQL Server, many custom application software vendors still don’t support connecting via Windows authentication.

Setting up SQL Server to support different authentication modes

When installing SQL Server, you have the choice to support only Windows authentication or both authentication methods, which is known as mixed mode. During the installation process, you decide whether or not to use mixed mode when defining the database engine configuration, as shown in Figure 1.


Figure 1: Selecting Authentication Methods

Windows Authentication, indicated by the red arrow in Figure 1, is selected as the default. If you need to support both Windows and SQL Server authentication, select the “Mixed Mode” radio button instead. Upon clicking this button, the SA account password boxes become enabled, and you need to specify a password for the SA account. When only Windows authentication is selected, the SA account is disabled. To secure the SA account when you are using mixed mode, you can disable the SA account after it has been enabled.

How to determine which authentication methods are supported

You can check to see which authentication method is configured in several ways. One of those ways is to use SQL Server Management Studio (SSMS). To use SSMS, first right click on the Instance name and select the Properties option. When I do that on my instance, the properties page in Figure 2 is displayed.


Figure 2: Determining Authentication Mode

Figure 2 shows that my instance supports mixed mode authentication because the radio button next to the red arrow is enabled.

Another method to check which authentication mode is set up is to use TSQL code. The code in Listing 1 displays the authentication mode setup.

SELECT CASE SERVERPROPERTY('IsIntegratedSecurityOnly')
          WHEN 1 THEN 'Windows Authentication Only'
          WHEN 0 THEN 'Windows and SQL Server Authentication'
       END as [Authentication Mode];

Listing 1: Displaying Authentication mode

Changing authentication methods After SQL Server is installed

There are times when you might want to change the authentication settings for a SQL Server instance. This might occur if you used the default settings during installation to support Windows authentication only and later acquired some software that can only connect using SQL Server authentication. Or possibly you want to make your instance more secure by removing support for SQL Server authentication. The authentication options can be easily changed using the properties page in SSMS shown in Figure 2.

If I wanted to change my instance to support only Windows authentication, all I would need to do is click on the “Windows authentication mode” button in Figure 2 and then click on the “OK” button to apply that change. After making this property change, I would need to restart my instance for the change to take effect.
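
Under the covers, this setting lives in the instance’s registry as the LoginMode value (1 = Windows authentication only, 2 = mixed mode). A commonly used T-SQL alternative to the GUI, shown here only as a sketch and still requiring an instance restart, writes that registry value directly:

EXEC master..xp_instance_regwrite
     N'HKEY_LOCAL_MACHINE',
     N'Software\Microsoft\MSSQLServer\MSSQLServer',
     N'LoginMode',
     REG_DWORD,
     2;   -- 2 = mixed mode, 1 = Windows authentication only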

SQL Server logins

In order to connect to SQL Server, one must have access to SQL Server. Access is granted via a login. A login is also known as a security principal and is stored in the master database. There is one exception, and that is accessing a contained database. With contained databases, users connect directly to the database without the need for a login in the master database. More on contained databases in future articles.

There are three types of logins that are stored in the master database: Windows user, Windows group, and SQL. Let’s review each of these different types of logins.

A Windows user login provides access for a single Windows user. When creating this type of login, no password is needed when defining the login in SQL Server. This type of login requires the user to first validate their login by logging into the Windows domain. The Windows domain stores the password.

A SQL Server login is similar to a Windows login in that it provides access to SQL Server for a single user, but it is different than a Windows login because the password for a SQL login is stored in the master database. Therefore, when setting up a SQL Server login, a password needs to be provided for the login along with a few other password options, as shown in Figure 3.


Figure 3: Setting up a SQL Server Authenticated Login

Figure 3 shows that a SQL Server Login can be enabled to enforce Windows password policies and expiration and can require a user to change the password upon their first login. Microsoft added these new password features when SQL Server 2005 was released. For applications to support these new password features, they can use the NetValidatePasswordPolicy API.
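
For illustration, a SQL Server login created by script with those same policy options enabled might look like the following sketch; the login name, password, and default database are placeholders rather than values from the article:

-- Illustrative only: enforce the password policy, expiration, and a forced change at first login
CREATE LOGIN [AppUser]
WITH PASSWORD = N'S0me$trongP@ssword!' MUST_CHANGE,
     CHECK_POLICY = ON,
     CHECK_EXPIRATION = ON,
     DEFAULT_DATABASE = [AdventureWorks2019];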

The last type of login, a Windows group login, is similar to a Windows login but slightly different. A Windows group login provides access to a SQL Server instance for every Windows login that is a member of the group. Windows groups are a great way to provide access to many Windows logins while only having to define a single login in SQL Server. Using a Windows group, access to the SQL Server instance can be maintained by adding or removing group members. Using Windows groups helps minimize security management effort and simplifies troubleshooting login-related security issues.

If you look at the bottom of the screenshot in Figure 3, you will notice a “Default Database” setting for a login. The default database setting when creating a login is the “master” database. When setting up a login, the default database can be changed to any database on the server. Best practice is to set the default database to a database that the user will use when connecting to SQL Server.

Windows logins are considered more secure because of the way the password for the login is stored. The password for a Windows login is stored using true encryption, whereas the password for a SQL login is not encrypted but hashed, which makes it easier to crack. Windows logins also require a domain administrator to set up the login and password, whereas for SQL logins the database administrator sets up the login and password. Having domain administrators manage login passwords provides another layer of security, commonly called separation of duties. Separating the duties of creating and managing Windows logins from managing databases and access to databases provides an additional security control over access to the data stored in SQL Server.

Creating a login to SQL Server allows users to connect to SQL Server. But a login alone doesn’t provide users access to any data in the different databases on the server. For a login to read and/or write data to a database, the login will need access to one or more databases. A login can be set up to have access to many databases on an instance if required.

Database users

A database user is not the same as a login. A login gives a user or application the ability to connect to a SQL Server instance, whereas a database user provides the login rights to access a database. Each database a login needs access to requires a database user to be defined, except when a login has been given sysadmin rights. When a login has sysadmin rights, it has access to all databases without being mapped to a database user. The association between a login and a database user is known as a user mapping. User mappings for a login can be created during login creation or added later for logins that have already been set up.

Creating a database user while creating a new login

To show how to provide user mapping when a new login is created, I will create a new SQL Server login named “Red-Gate”. The screenshot shown in Figure 4 shows the “Login – new” window where I will define this new login. To bring up this window, I expand the “Security” tab under my instance, and then right-click on the “Logins” option and then select the “New Login…” item from the drop-down.


Figure 4: Creating Red-Gate Login

In Figure 4, I enter “Red-Gate” for the login name and enter the password for this SQL login in the dialog boxes provided. To provide database access for this new login, I click on the “User Mapping” option in the left pane. When I do this, the window in Figure 5 is displayed.


Figure 5: User Mapping Window

A red box shows the list of databases in Figure 5, where my new login can be mapped. In order to map my “Red-Gate” login to the “AdventureWorks2019” database, I would just need to click on the “Map” checkbox next to the AdventureWorks2019 database. When I do that, the screenshot in Figure 6 is displayed.


Figure 6: Mapping Login to database

After checking the Map box, the name “Red-Gate” is automatically populated in the “User” column for the AdventureWorks2019 database. This interface automatically generates a database user name that matches the login to which it is mapped. Database user names don’t need to be the same as the login; if I wanted my database user name to be different, I could just type over the “Red-Gate” name and specify whatever database user name I wanted. Mapping a login to a database user only gives that login access to the database; it doesn’t give the login access to read or update data in the database. In future articles, I will discuss how to provide read/write access to database objects.

Suppose I wanted to map my new “Red-Gate” login to additional user databases. In that case, I could do that by just checking on another “Map” checkbox next to the additional databases. For this example, I only want to map my new “Red-Gate” login to the “AdventureWorks2019” database. To finish up mapping my “Red-Gate” login to the “Red-Gate” database user I just need to click on the “OK” button.

Creating a new database user and mapping it to an existing login

There are times when a login has already been created, and it just needs access to one more database. For example, suppose I now wanted my established Red-Gate SQL Server login to access the database named “MyDatabase”. To give the Red-Gate login additional database access, I have a number of options. One option would be to just modify the user mappings by changing the properties on the login. This would be similar to how I just added the user mapping when I created the Red-Gate login.

Another option is to add a new database user to “MyDatabase” and then map that new database user to the Red-Gate login. To create a new user in the database “MyDatabase”, I would first expand the database, right-click on the “Security” item, hover over the “New” item, and then click on the “User…” item, as shown in Figure 7.


Figure 7: Bring up the new database user dialog

When I click on the new “User…” menu item, the window in Figure 8 is displayed.


Figure 8: Adding a new database user

To give the Red-Gate login access to MyDatabase, I need to fill out the form in Figure 8. The first item in Figure 8 to consider is the “User Type”. This field defaults to “SQL User with Login”. There are four other types: SQL user without login, user mapped to a certificate, user mapped to an asymmetric key, and Windows user. Since I am creating a database user that will be mapped to a SQL login, I use the default. Next, I enter the database user name for the user I am creating. It could be any name, but I prefer to make the database user name match the login it is associated with, so I enter “Red-Gate” in the “User Name” field. I next map my new user to a login. To do the mapping, I can either type in “Red-Gate” for the login or use the ellipsis button (…) to browse through the list of logins already created and select one.

The last item needed is to define a default schema for this login. A schema name is associated with a database object collection owned by a database user. By default, each database has a schema named “dbo” owned by the “dbo” user account. You don’t need to specify a schema when you define a new database user; if one is not specified, the “dbo” schema becomes the default schema. Since this article is just a primer, I will not discuss the different aspects of schemas here; I’ll leave that for another article. When I create my new Red-Gate database user, I’ll leave the default schema item empty and let the new-user creation process set the default schema automatically to “dbo”.

Once I’ve created my new user, I can verify it exists in the database by expanding the “User” item under the “Security” folder in Object Explorer. You can also create a new database user and map it to a login using a script. Listing 2 is an example of using TSQL to create the same login I just created using the point and click method.

USE [MyDatabase]
GO
CREATE USER [Red-Gate] FOR LOGIN [Red-Gate]
GO

Listing 2: Create a Red-Gate database user using a TSQL script
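
As a side note, if you wanted to set the default schema explicitly rather than letting it fall back to “dbo” as described earlier, the same statement accepts a DEFAULT_SCHEMA clause (shown purely as an illustration):

CREATE USER [Red-Gate] FOR LOGIN [Red-Gate] WITH DEFAULT_SCHEMA = [dbo]
GO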

SQL Server authentication methods, logins, and database users

To connect to SQL Server, a person or process needs to authenticate. There are two different methods to authenticate to SQL Server: Windows and SQL Server. Windows is the more secure and recommended method for connecting to SQL Server. Each connection that authenticates to SQL Server gains access to the instance via a login. Logins are defined at the server level. Logins by themselves don’t provide access to data within SQL Server; to gain access to data in a database, a login needs to be mapped to a database user. The authentication methods, logins, and database users provide the basic security foundations for SQL Server security.

If you liked this article, you might also like Understanding SQL Server Recovery Models.


SQL – Simple Talk


Database trends: Why you need a ledger database

January 18, 2021   Big Data

The problem: The auto dealer can’t sell the car without being paid. The bank doesn’t want to loan the money without insurance. The insurance broker doesn’t want to write a policy without payment. The three companies need to work together as partners, but they can’t really trust each other.

When businesses need to cooperate, they need a way to verify and trust each other. In the past, they traded signed and sealed certificates. Today, you can deliver the same assurance with digital signatures, a mathematical approach that uses secret keys to let people or their computers validate data. Ledger databases are a new mechanism for marrying data storage with some cryptographic guarantees.

The use cases

Any place where people need to build a circle of trust is a good place to deploy a ledger database.

  • Cryptocurrencies like Bitcoin inspired the approach by creating a software tool for tracking the true owner of every coin. The blockchain run by the nodes in the Bitcoin network is a good example of how signatures can validate every transaction that changes ownership.
  • Shipping companies need to track goods as they flow through a network of trucks, ships, and planes. Loss and theft can be minimized if each person along the way explicitly transfers control.
  • Manufacturers, especially those that create products like pharmaceuticals, want to make sure that no counterfeits enter the supply chain.
  • Coalitions, especially industry groups, that need to work together while still competing. The ledger database can share a record of the events while providing some assurance that the history is accurate and unchanged.

The solution

Standard databases track a sequence of transactions that add, delete, or change entries. Ledger databases add a layer of digital signatures for each transaction so that anyone can audit the list and see that it was constructed correctly and, more importantly, that no one has gone back to adjust a previous transaction, to change history so to speak.

The digital signatures form a chain that links the individual rows or entries. Each signature is constructed to certify the data in the new row and also the data in the previous row. Taken together, all of the signatures added over time certify the sequence that data was added to the log. An auditor can look at some or all of the signatures to make sure they’re correct.

In the case of Bitcoin, the database tracks the flow of every coin over time since the system was created. The transactions are grouped together in blocks that are processed about every ten minutes, and taken together, the chain of these blocks provides a history of the owner of every coin.

Bitcoin also includes an elaborate consensus protocol where anyone can compete to solve a mathematical puzzle and validate the next block on the chain. This ritual is often called “mining” because the person who solves this computational puzzle is rewarded with several coins. The protocol was designed to remove the need for central control by one trusted authority — an attractive feature for some coin owners. It is open and offers a relatively clear mechanism for resolving disputes.

Many ledger databases avoid this elaborate ritual. The cost of competing to solve these mathematical puzzles is quite high because of the energy that computers consume while they’re solving the puzzle. The architects of these systems just decide at the beginning who will be the authority to certify the changes. In other words, they choose the parties that will create the digital signatures that bless each addition without running some competition each step.

In the example from the car sales process, each of the three entities may choose to validate each other’s transactions. In some cases, the database vendor also acts as an authority in case there are any external questions.

The legacy players

Database vendors have been adding cryptographic algorithms to their products for some time. All of the major companies, like Oracle or Microsoft, offer mechanisms for encrypting the data to add security and offer privacy. The same toolkits include algorithms that can add digital signatures to each database row. In many cases, the features are included in the standard licenses, or can be added for very little cost.

The legacy companies are also adding explicit features that simplify the process. Oracle, for instance, added blockchain tables to version 21c of its database. They aren’t much different from regular tables, but they only support inserting rows. Each row is pushed through a hash function, and then the result from the previous row is added as a column to the next row that’s inserted. Deletions are tightly controlled.

The major databases also tend to have encryption toolkits that can be integrated to achieve much the same assurance. One approach with MySQL adds a digital signature to the rows. It is often possible to adapt an existing database and schema to become a ledger database by adding an extra field to each row. If the signature of the previous row is added to the new row, a chain of authentication can be created.
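
As a rough illustration of that idea, here is a minimal sketch in T-SQL; the table, columns, and payload are invented for the example and are not any vendor’s built-in ledger feature. Each new row stores the previous row’s hash along with a hash computed over its own payload plus that previous hash, forming a verifiable chain:

-- Minimal sketch of a hash-chained ledger table (all object names are hypothetical)
CREATE TABLE dbo.LedgerEntry (
    EntryId  INT IDENTITY(1,1) PRIMARY KEY,
    Payload  NVARCHAR(4000) NOT NULL,
    PrevHash VARBINARY(32) NULL,      -- hash of the previous row (NULL for the first row)
    RowHash  VARBINARY(32) NOT NULL   -- SHA-256 over this row's payload plus PrevHash
);
GO
-- Append an entry: read the last row's hash, then chain it into the new row's hash
DECLARE @prev VARBINARY(32) =
    (SELECT TOP (1) RowHash FROM dbo.LedgerEntry ORDER BY EntryId DESC);
DECLARE @payload NVARCHAR(4000) = N'loan approved for vehicle 1234';
INSERT INTO dbo.LedgerEntry (Payload, PrevHash, RowHash)
VALUES (@payload, @prev,
        HASHBYTES('SHA2_256', CONCAT(@payload, CONVERT(NVARCHAR(100), @prev, 1))));

An auditor can later replay the chain from the first row, recompute every hash, and compare the results with the stored values to detect any retroactive edit.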

The upstarts

There are hundreds of startups exploring this space. Some are tech companies that are approaching the ledger database space like database developers. You could think of some others as accidental database creators.

It is a bit of a reach to include all of the various crypto currencies as ledger databases in this survey, but they are all managing distributed blockchains that store data. Some, like Ethereum, offer elaborate embedded processing that can create arbitrary digital contracts. Some of the people who are nominally buying a crypto coin as an asset are actually using the purchase to store data in the currency’s blockchain.

The problem for many users is that the cost of storing data depends on the cost of creating a transaction, and in most cases, these can be prohibitive for regular applications. It might make sense for special transactions that are small enough, rare enough, and important enough to need the extra assurance that comes from a public blockchain. For this reason, most of the current users tend to be speculators or people who want to hold the currency, not groups that need to store a constant volume of bits.

Amazon is offering the Quantum Ledger Database, a pay-as-you-go service with what the company calls an “SQL-like API”. All writes are cryptographically sealed with the SHA-256 hash function, allowing any auditor to go through the history to double-check the time of all events. The pricing is based upon the volume of data stored, the size of any indices built upon the data, and the amount of data that leaves. (It’s worth noting that the word “quantum” is just a brand name. It does not imply that a quantum computer is involved.)

The Hyperledger Fabric is a tool that creates a lightly interconnected version of the blockchain that can be run inside of an organization and shared with some trusted partners. It’s designed for scenarios where a few groups need to work together with data that isn’t shared openly. The code is an open source constellation of a number of different programs, which means that it’s not as easy to adopt as a single database. IBM is one company that’s offering commercial versions, and many of the core routines are open source.

Microsoft’s Blockchain service is more elaborate. It’s designed to support arbitrary digital contracts, not just store some bits. The company offers both a service to store the data and a full development platform for creating an architecture that captures your workflow. The contracts can be set up either for your internal teams or across multiple enterprises to bind companies in a consortium.

BigchainDB is built on the MongoDB NoSQL model. Any MongoDB query will work. The database will track the changes and share them with a network of nodes that will converge upon the correct value. The consensus-building algorithms can survive failed nodes and recover.

Is there anything a ledger can’t do?

Because it’s just a service for storing data, any bits that might be stored in a traditional database can be stored in a ledger database. The cost of updating the cryptographic record for each transaction, though, may not be worth it for many high-volume applications that don’t need the extra assurance. Adding the extra digital signature requires more computation. It’s not a significant hurdle for low-volume tables like a bank account where there may be only a few transactions per day; there, the need for accuracy and trust far outweighs the costs. But it could be prohibitive for something like a log file of high-volume activity that has little need for assurance. If some fraction of a social media chat application disappeared tomorrow, the world would survive.

The biggest question is just how important it will be to trust the historical record in the future. If there’s only a slim chance that someone might want to audit the transaction journal, then the extra cost of computing the signatures or the hash values may not be worth it.

This article is part of a series on enterprise database technology trends.


Big Data – VentureBeat


Database version control: Getting started with Flyway

January 16, 2021   BI News and Info

“Database migrations made easy” and “Version control for your database” are a couple of headlines you will find on Flyway’s official website. And let me tell you this, those statements are absolutely correct. Flyway is a multi-platform, cross-database version control tool with over 20 supported databases.

From all my years of experience working as an Architect for monolith and cloud-native apps, Flyway is by far the easiest and best tool on the market to manage database migrations.

Whether you are an experienced data professional or starting to get involved in the world of data, this article is the foundation of a series that will get you through this fantastic journey of database migrations with Flyway.

Background history

Flyway was created by Axel Fontaine in early 2010 at Google Code under the Apache 2.0 license. According to Axel’s words, it all started when he searched for a tool that allows integrating application and database changes easily and simply using plain SQL. To his surprise, that kind of tool didn’t exist, and it makes total sense to me because there were not many options back at that time.

To put the previous paragraph in context: everything we know as DevOps today was conceived around 2009, and David Farley and Jez Humble released the well-known “Continuous Delivery” book in 2010. Therefore, Axel was, without question, a pioneer in deciding to write his own tool to solve this widespread software development problem: make database changes part of the software deployment process.

Flyway acceptance was great among the developer community, leading to high-speed growth and evolution. For example, the list of supported databases grew, additional support to multiple operating systems was added, and many more features were included from version to version.

The next step in Flyway’s evolution was Pro and Enterprise editions’ launch back in December 2017, which was a smart decision to secure the project’s progression and viability. Without question, Flyway was already the industry-leading standard for database migrations at that time.

Around mid-2019, Redgate Software acquired Flyway from Axel Fontaine. Redgate’s expertise in the database tooling space opens the door to Flyway for new opportunities in expansion, adoption, and once more evolution!

Database migrations

You are probably already familiar with the term database migration, which can mean several different things within the context of enterprise applications. It could mean moving a database from one platform to another, or moving a database from a previous version of the DBMS engine to the most recent one. Another common scenario these days is moving a database from an on-premises environment to a cloud IaaS or PaaS solution.

This article is not related to any of these practices mentioned above. This article will get you started with database migrations in the context of schema migrations. Yes, this is another kind of database migration which means the practice of evolving a database schema with incremental, reversible, and consistent changes through a simple approach. This approach enables integrating database changes with version control and application deployment processes.

Before digging deeper into this topic, I would like to address the basic requirements of database migrations. Trust me, this topic is fascinating and full of great information that will help you adopt this practice. Whether you are a software developer, database administrator, or solutions architect, understanding database development practices like this is essential to becoming a better professional.

Evolutionary Database Design is the title of an article published on Martin Fowler’s website in May 2006. It is an extract of the Refactoring Databases book by Scott Ambler and Pramod Sadalage, also released in 2006. This material goes above and beyond in explaining the evolution of database development practices through the years, providing techniques and best practices for embracing database changes in software development projects, especially when adopting agile methodologies.

The approach described in this book sets the stage for a collection of best practices that should be followed to be successful.

DBA and developer collaboration

Software development practices like DevOps demand that people with different skills and backgrounds collaborate closely, knocking down the silos and bottlenecks between multiple teams, like the usual separation between development and operations.

In a database development effort, collaboration is crucial to the success of the project. Developers and DBAs should work in harmony, assessing the impact of the database changes proposed before implementing them. Anybody can take the initiative to start the conversations around whether the database code is optimal, secure, and scalable, or simply to make sure it is following best practices.

Version control

Without question, everybody benefits from using version control. All the artifacts that are part of a software project should be included to keep track of the contributor’s individual changes. Starting from the application code, unit and functional tests, database scripts, and even other code types such as build scripts used to create an environment from scratch, known today as Infrastructure as Code.

All databases changes are migrations

All database changes created during the earlier stages of the development phase should be captured, no exception. This approach encourages treating database change files like any other artifact of the application, making sure to save and commit these change files to the same version control repository as the application code so they are versioned together.

Migration scripts should include but are not limited to any modification made to your database schema like DDL (Data definition language) and DML (Data manipulation language) changes or data correction changes implemented to solve a production data problem.

Everybody gets their own instance

It is very common for organizations to have shared database environments. This is often a bad idea due to the imminent risk of project delays caused by unexpected resource contention problems. In other cases, delays are caused by interruptions from the development team itself, for example when a person working on some database objects finds that those objects were modified as part of a last-minute database schema refactoring.

Everyone learns by experimenting with new things. Having a personal workspace where one can endeavor to explore a creative way to solve a problem is excellent! More importantly, being able to work free of interruptions increases productivity.

Leveraging technologies like Docker containers to create an isolated and personal database development environment/workspace seems like a good way to resolve this issue. Other solutions like Windows Subsystem for Linux (WSL) take this approach to a whole new level, providing an additional operating system on top of the Windows workstation.

Leverage continuous integration

Continuous Integration —CI, for short— is a software development practice that consists of merging all changes from a developer’s workspace copy to a specific software branch.

Best practices recommend that each developer should integrate all changes from their workspace into the version control repository at least once a day.

There is a plethora of tools available to set up a continuous integration process like the one recommended above. The one to choose depends on the size of the organization and budget. The most popular are Jenkins, Circle CI, Travis CI, and GitLab.

According to the theory behind this practice, there are a few key characteristics a database migration tool should meet:

  • All migrations must have a unique identifier
  • All migrations must be recorded in a migration history table
  • All migrations should be repeatable and reversible

All these practices and characteristics sound attractive to speed up a database development effort. However, the question is: How and what can we use to approach database migrations easily? Worry no more, Flyway to the rescue!


What is Flyway?

Flyway’s official documentation describes the tool as an open-source database migration tool that strongly favors simplicity and convention over configuration designed to facilitate continuous integration processes for any database on any platform.

Migrations can be written in plain SQL, of course, as explained at the beginning of this article. This type of migration must follow the specific syntax rules of each database engine, such as PL/pgSQL for PostgreSQL, T-SQL for SQL Server, PL/SQL for Oracle, etc.
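
For instance, a first SQL migration is just a script whose file name follows Flyway’s V<version>__<description>.sql convention; the table below is only an illustration:

-- V1__Create_person_table.sql
CREATE TABLE person (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);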

Flyway migrations can also be manually executed through its command-line client or programmatically using the Java API, Docker containers, or Maven and Gradle plugins.

It supports more than twenty database engines by default. Whether the database is hosted on-premises or in a cloud environment, Flyway will have no problem connecting, leveraging the JDBC driver library shipped with the tool.

Flyway folder architecture

At the time of this writing (December 2020), Flyway’s latest version is 7.3.2, which has the following directory structure:

[Screenshot from the official Flyway documentation showing the installation’s directory structure]

As you can see, the folder structure is very straightforward; the documentation is so good that it includes a brief description of some of the folders. Let’s take an in-depth look and define each one of these folders.

The conf folder is the default location where Flyway will look for the database connectivity configuration. Flyway uses the simple key-value pair approach to set and load specific configurations via the flyway.conf file. I will address the configuration file in detail in future articles; for now, I will stick to this simple definition.

Flyway was written in Java, hence the existence of JRE and lib folders. I strongly recommend leaving those folders alone; any modification to the files within these folders will compromise Flyway’s functionality.

The licenses folder contains the teams, community, and third-party license information in the form of a text file; these three files are available for you if you want to take a look and read all details about each type of license.

The drivers folder is the place where all the JDBC drivers mentioned before can be found in the form of jar files. This folder is worth exploring in detail to see what is shipped with the tool in terms of database connectivity through JDBC.

I will use my existing Flyway 7.3.2 environment for macOS. I’ll start by verifying my current Flyway version using the flyway -v command:


Good, as you can see, I’m on the 7.3.2 version. This is the same version used from the official documentation screenshot that describes the folder structure. Now, I will find the actual folder where Flyway is installed using the which flyway Linux command:


Using the command tree -d, I can list all folders inside the Flyway installation path:


Then I simply have to navigate towards the drivers folder and list all files inside this path using the ls -ll Linux command:


Look at that long list of JDBC drivers in the form of jar files; right out of the box, you can connect to the most popular database engines like PostgreSQL, Microsoft SQL Server, SQLite, Snowflake, MySQL, Oracle, and more.

Following the folder structure, there are the jars and sql folders where you want to store your Java or SQL-based migrations. Flyway will look at these folders by default to automatically discover filesystem (SQL scripts) or Classpath (Java) migrations. Of course, these default locations can be overridden at execution time via a config file and environment variables.
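
For example, the default sql location can be overridden either in conf/flyway.conf or through the matching environment variable; a minimal sketch with a placeholder path:

# conf/flyway.conf
flyway.locations=filesystem:/opt/myapp/migrations
# or, equivalently, as an environment variable:
# FLYWAY_LOCATIONS=filesystem:/opt/myapp/migrations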

Finally, there are the executable files. As you can see, there are two: one for macOS/Linux systems (flyway) and one for Windows (flyway.cmd).

How it works

Take a look at the following visual example, where there is an application called Shiny Soft and an empty shell database called Shiny DB. Flyway is installed on the developer’s workstation, where a couple of migrations were created to deploy some database changes.


The first thing Flyway will do when starting this project is to check whether the migration history table exists. This example begins the development effort with an empty shell database. Therefore, Flyway will proceed to create the flyway_schema_history table on the target database called Shiny DB.


Right after creating the migration history table, Flyway will scan and apply all available migrations found in its default locations (jars/sql).


Simultaneously, the flyway_schema_history table was updated with two new records, one for each of the available migrations (Migration 1 and Migration 2).

This table will contain a high level of detail that will help you to understand better how the database schema is evolving. Take a look at the following example:


As you can see, there are two entries. Each has a version, description, type of migration, the script used, and more audit information.

This metadata is valuable and crucial to Flyway’s functionality because it helps Flyway keep track of the current and future versions of your database. And yes, Flyway is also capable of identifying the migrations that are pending to be applied.
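
If you want to look at that audit trail yourself, you can query the history table directly; a minimal sketch using the column names recent Flyway versions create:

SELECT version, description, type, script, installed_on, success
FROM flyway_schema_history
ORDER BY installed_rank;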

Imagine a scenario where Migration 2 needs to be refactored, creating just one table instead of two. What you want to do is to create a new file called Migration 2.1. This migration will include the DDL instructions to drop the two existing tables and create a new one instead.
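
If this refactoring were written as a SQL migration (the sketch below uses invented object names; the scenario in this article ends up applying it through the Java API instead), it could look like this, where the V2_1__ prefix makes Flyway record it as version 2.1:

-- V2_1__Consolidate_tables.sql
DROP TABLE table_one;
DROP TABLE table_two;
CREATE TABLE combined_table (
    id   INT NOT NULL PRIMARY KEY,
    data VARCHAR(255) NOT NULL
);

Running flyway migrate then applies it, while flyway info lists any migrations that are still pending.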

Flyway will automatically flag this new migration as pending; however, it will not apply the migration until you decide to do so.


Once Migration 2.1 is applied, Flyway will update the flyway_schema_history table with a new record for the latest migration applied:

Notice that the third record, corresponding to database version 2.1, is not a SQL script; the type column shows JDBC because this was a Java API migration successfully applied to perform a database refactoring change.


Advantages

At this point, you should be a little more familiar with Flyway. I briefly described what it is and how it works. Now, stop to think about what advantages you will get by including Flyway as the central component of your database deployment management.

In software development, as with everything you do in life, the longer you take to close the feedback loop, the worse the results are. Evolving a monolithic legacy database, where any database change is performed following the state-based database deployment approach, could be challenging. However, choosing the right tool for the job should make your transition to a migration-based deployment easier and painless.

Embracing database migrations with Flyway could not be easier. Whether you choose to start with SQL script-based migrations or Java classes, the learning curve is relatively small. You can always rely on Flyway’s documentation to check, learn, and get guidance on every single command and functionality shipped with the tool out of the box.

For starters, you don’t have to worry about keeping detailed control of all changes applied to your database. All the information from past and future migrations is held in great detail in Flyway’s schema history table. This is not just a simple control table. What I like about this schema history table is the level of detail about every single migration applied to the database: you will be able to identify the type of migration (SQL, Java), who made the change, when, and exactly what was changed in your database.

Another major pain point solved by Flyway is database schema mismatch. This is a widespread and painful problem encountered when working with different environments like development, test, QA, and production. Recreating a database from scratch while specifying the exact schema version you want to deploy is a powerful thing. A database migration tool like Flyway will ensure that all the changes belonging to a specific version of your application are applied. Database changes should be deployed together with application changes.

Conclusion

This article provides a foundation and detailed explanation of Evolutionary database design techniques and practices required to approach database migrations with tools like Flyway.

I also included a summary of Flyway as a database migration tool, starting from its early days and explaining why and how the tool was born. Finally, it explored the folder structure and components and provided a visual and descriptive example of how this tool approaches database migrations with ease.

Please join me in the next article in the series, which focuses on how to install Flyway’s command-line tool for Linux/macOS and Windows and explores all the details related to its configuration through config files and environment variables.


SQL – Simple Talk


Teradata Named a Cloud Database Management Leader in the 2020 Gartner Magic Quadrant for Cloud Database Management Systems

January 10, 2021   BI News and Info

Teradata Vantage Also Ranked Highest in Three Out of Four Use Cases in the 2020 Gartner Critical Capabilities for Cloud Database Management Systems for Analytical Use Cases

Teradata (NYSE: TDC), the cloud data analytics platform company, today announced it has been recognized as a Leader in the 2020 Gartner Magic Quadrant for Cloud Database Management Systems, issued November 23, 2020 by analysts Donald Feinberg, Merv Adrian, Rick Greenwald, Adam Ronthal and Henry Cook. Gartner evaluates vendors placed in the Magic Quadrant for Cloud Database Management Systems on completeness of vision and ability to execute following detailed research.

In tandem, Teradata Vantage – the company’s hybrid multi-cloud data analytics software platform – was also recognized with the highest scores in three out of four use cases in the 2020 Gartner Critical Capabilities for Cloud Database Management Systems for Analytical Use Cases, issued November 24, 2020 by analysts Henry Cook, Donald Feinberg, Merv Adrian, Rick Greenwald, and Adam Ronthal. Among the 16 vendors evaluated, Teradata ranked highest in three out of four analytical use cases – Traditional Data Warehouse, Logical Data Warehouse, and Data Science Exploration/Deep Learning – and ranked second in Operational Intelligence.

“Teradata is committed to providing the best enterprise data analytics in the cloud – period. This means offering our customers a modern data analytics platform that can handle the large and complex workloads that Teradata is known for, and flexible deployment options that don’t limit choice or lock them in,” said Steve McMillan, CEO, Teradata. “We’re the only data warehouse and analytics provider with hybrid multi-cloud offerings across the top three public cloud vendors, providing the same software, features, and experience regardless of environment. This recognition from Gartner validates the strength of our cloud position and further underscores the commitment to our customers to meet them wherever they are on their cloud journey.”

Teradata’s unique hybrid and multi-cloud offerings have become critical differentiators in the Cloud Database Management Systems market. According to Gartner, “The capability to work across intercloud, multicloud and hybrid is increasingly important. This will likely become a prerequisite for these systems.”

Gartner defines the Cloud Database Management Systems (DBMS) as a fully provider-managed public or private cloud software system that manages data in cloud storage. These management systems include specific optimization strategies designed for supporting traditional transactions and/or analytical processing covering one or more of the following seven use cases: Traditional and Augmented Transaction Processing; Traditional and Logical Data Warehouse; Data Science Exploration/Deep Learning; Operational Intelligence; and Stream/Event Processing. Data is stored in a cloud storage tier (e.g., cloud object store, HDFS or other proprietary cloud storage infrastructure), and may use multiple data models — relational, non-relational (document, key-value, wide-column, graph), geospatial, timeseries and others.

Teradata Vantage is the leading hybrid multi-cloud data analytics software platform that enables ecosystem simplification by unifying analytics, data lakes and data warehouses. With Vantage, enterprise-scale companies can eliminate silos and cost-effectively query all their data, all the time, regardless of where the data resides – in the cloud, on multiple clouds, on-premises or any combination thereof – to get a complete view of their business. And now, Teradata is the only data warehouse and analytics provider with hybrid multi-cloud offerings across the top three public cloud vendors, offering maximum flexibility and choice to its enterprise customers.


Gartner Disclaimer
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.


Teradata United States


Dynamics 365 CE On Premises: Data Files – CRM database autogrowth

January 6, 2021   Microsoft Dynamics CRM

Data Files – CRM database autogrowth

A Dynamics 365 CE database usually has one data file. If you didn’t change the default configuration, it is very likely that autogrowth is set to 1 MB and unlimited. If so, this number is very low, and every time the file fills up and needs to grow, SQL Server has to take a little time to add another 1 MB, which means it will happen very often. On the other hand, suppose you have autogrowth set to 10,240 MB (10 GB) and unlimited. This number is very large, and when the file fills up and needs to grow, SQL Server can freeze for a while as it takes time to add another 10 GB.

Let’s take a look at the Autogrowth configuration:


According to the picture above, we can have the following configuration set:

Autogrowth:

  • Enabled
    • File Growth
      • in Percent
      • in Megabytes
    • Maximum File Size
      • Limited to (MB)
      • Unlimited
  • Disabled

If you enable autogrowth, you need to specify the metrics. You can set the file to grow by a percentage or by a number of megabytes. Also, you can set it to grow until a limited value in MB or leave it unlimited (meaning the file can grow until the total disk space capacity is reached). Let’s suppose you set your file growth as a percentage. In this case, it is pretty common to keep it increasing by 10 percent. That can be very quick if the size of your data file is up to 1 GB (it would increase by ~100 MB). But if your data file is 1 TB, 10 percent will increase it by 100 GB, and your SQL Server can be very busy until the file growth finishes.


Consider changing the configuration of the SQL Server data files for MSCRM. One option may be to increase the maximum allocation size manually and disable automatic growth, to prevent SQL Server from taking time to grow the file by 10 GB frequently. Obviously, this requires monitoring to avoid reaching the maximum disk allocation reserved for the data files.
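
If you prefer a fixed, moderate growth increment instead of disabling growth entirely, it can be set with T-SQL. This is only a sketch: the database name and logical file name below are assumptions, so look up the real ones in sys.master_files first.

-- Hypothetical example: set a fixed 4 GB growth increment on the MSCRM data file
ALTER DATABASE [MyOrg_MSCRM]
MODIFY FILE (NAME = N'mscrm', FILEGROWTH = 4096MB);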

Walter Carlin – MBA, MCSE, MCSA, MCT, MCTS, MCPS, MBSS, MCITP, MS

Senior Customer Engineer – Dynamics 365 – Microsoft – Brazil



Dynamics 365 Customer Engagement in the Field


9 trends in enterprise database technology

December 30, 2020   Big Data

The database has always revolved around rock-solid reliability. Data goes in and then comes out in exactly the same way. Occasionally, the bits will be cleaned up and normalized so all of the dates are in the same format and the text is in the same character set, but other than that, nothing should be different.

That consistency is what makes the database essential for any enterprise — allowing it to conduct things like ecommerce transactions. It’s also why the database remains distinct from the data warehouse, another technology that is expanding its mission for slower-twitch things like analysis. The database acts as the undeniable record of the enterprise, the single source of truth.

Now databases are changing. Their focus is shifting and they’re accepting more responsibilities and offering smarter answers. In short, they’re expanding and taking over more and more of the stack.

Many of us might not notice because we’ve been running the same database for years without a change. Why mess with something that works? But as new options and features come along, it makes sense to rethink the architectures of data flows and take advantage of all the new options. Yes, the data will still be returned exactly as expected, but it will be kept safer and presented in a way that’s easier to use.

Many drivers of the change are startups built around a revolutionary new product, like multi-cloud scaling or blockchain assurance. For each new approach to storing information, there are usually several well-funded startups competing to dominate the space and often several others still in stealth mode.

The major companies are often not far behind. While it can take more time to add features to existing products, the big companies are finding ways to expand, sometimes by revising old offerings or by creating new ones in their own skunkworks. Amazon, for instance, is the master at rolling out new ways to store data. Its cloud has at least 11 different products called databases, and that doesn’t include the flat file options.

The other major cloud providers aren’t far behind. Microsoft has migrated its steadfast SQL Server to Azure and found ways to offer a half-dozen open source competitors, like MySQL. Google delivers both managed versions of relational databases and large distributed and replicated versions of NoSQL key/value pairs.

The old standards are also adding new features that often deliver much of the same promise as the startups while continuing support of older versions. Oracle, for instance, has been offering cloud versions of its database while adding new query formats (JSON) and better performance to handle the endless flood of incoming data.

IBM is also moving Db2 to the cloud while adding new features like integration with artificial intelligence algorithms that analyze the data. It’s also supporting the major open source relational databases while building out a hybrid version that merges Oracle compatibility with the PostgreSQL engine.

Among the myriad changes to old database standards and new emerging players, here (in no particular order) are nine key ways databases are being reborn.

1. Better query language

SQL may continue to do the heavy lifting around the world. But newer options for querying — like GraphQL — are making it easier for front-end developers to find the data they need to present to the user and receive it in a format that can be dropped right into the user interface.

GraphQL follows the standard JavaScript format for serializing objects, making it easier for middle- and front-end code to parse it. It also hides some of the complexity of JOINs, making it simpler for end users to grab just the data they need. Developers are already adding tools like Apollo Studio, an IDE for exploring queries, or Hasura, an open source front-end that wraps GraphQL around legacy databases like PostgreSQL.

2. Streaming databases follow vast flows

The model for a standard database is a big ledger, much like the ones clerks would maintain in fat bound books. Streaming databases like ksqlDB are built to watch an endless stream of data events and answer questions about them. Instead of imagining that the data is a permanent table, the streaming database embraces the endlessly changing possibilities as data flows through it.
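As a rough illustration of the idea (not from the article), a ksqlDB-style statement looks like SQL but defines a query that never finishes: it keeps updating its result as new events arrive on a Kafka topic. The stream, topic and column names below are invented for the example.

-- Declare a stream over an existing Kafka topic (hypothetical names)
CREATE STREAM pageviews (user_id VARCHAR, url VARCHAR)
  WITH (KAFKA_TOPIC = 'pageviews', VALUE_FORMAT = 'JSON');

-- A continuously maintained aggregate that updates as events flow in
CREATE TABLE views_per_user AS
  SELECT user_id, COUNT(*) AS view_count
  FROM pageviews
  GROUP BY user_id
  EMIT CHANGES;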

3. Time-series database

Most databases have special column formats for tracking date stamps. Time-series databases like InfluxDB or Prometheus do more than just store the time. They track and index the data for fast queries, like how many times a user logged in between January 15 and March 12. These are often special cases of streaming databases where the data in the streams is being tracked and indexed for changes over time.
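InfluxDB and Prometheus each use their own query languages, but the shape of that example question is easy to sketch in plain SQL against a hypothetical login_events table; what a time-series database adds is storage and indexing tuned so that this kind of time-range scan stays fast as data keeps streaming in.

-- Illustrative only: count logins for one user in a date range (hypothetical table)
SELECT COUNT(*) AS login_count
FROM login_events
WHERE user_id = 42
  AND event_time >= '2021-01-15'
  AND event_time < '2021-03-13';   -- up to and including March 12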

4. Homomorphic encryption

Cryptographers were once happy to lock up data in a safe. Now some are developing a technique called homomorphic encryption to make decisions and answer queries on encrypted data without actually decrypting it, a feature that vastly simplifies cloud security and data sharing. This allows computers and data analysts to work with data without knowing what’s in it. The methods are far from comprehensive, but companies like IBM are already delivering toolkits that can answer some useful database queries.

5. In-memory database

The original goal of a database was to organize data so it could be available in the future, even when electricity is removed. The trouble is that sometimes even storing the data to persistent disks takes too much time, and it may not be worth the effort. Some applications can survive the occasional loss of data (would the world end if some social media snark disappeared?), and fast performance is more important than disaster recovery. So in-memory databases like Amazon’s ElastiCache are designed for applications that are willing to trade permanence for lightning-fast response times.

6. Microservice engines

Developers have traditionally built their code as a separate layer that lives outside the database itself, and this code treats the database as a black box. But some are noticing that the databases are so feature-rich they can act as microservice engines on their own. PostgreSQL, for instance, now allows embedded procedures to commit full transactions and initiate new ones before spitting out answers in JSON. Developers are recognizing that the embedded code that has been part of databases like Oracle for years may be just enough to build many of the microservices imagined by today’s architects.
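As a small sketch of that pattern (PostgreSQL dialect; the table, function and column names are invented for illustration), a single stored function can perform its own transactional work and hand back JSON that a front end consumes directly, with no separate service layer in between.

-- Hypothetical table backing the "microservice"
CREATE TABLE IF NOT EXISTS orders (
    id         serial PRIMARY KEY,
    customer   text NOT NULL,
    total      numeric NOT NULL,
    created_at timestamptz DEFAULT now()
);

-- A function that inserts a row and returns a JSON payload for the caller
CREATE OR REPLACE FUNCTION place_order(p_customer text, p_total numeric)
RETURNS json
LANGUAGE plpgsql
AS $$
DECLARE
    new_id int;
BEGIN
    INSERT INTO orders (customer, total)
    VALUES (p_customer, p_total)
    RETURNING id INTO new_id;

    RETURN json_build_object('orderId', new_id, 'customer', p_customer, 'total', p_total);
END;
$$;

-- Usage: SELECT place_order('Acme Corp', 199.99);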

Jupyter notebooks started out as a way for data scientists to bundle their answers with the Python code that produced them. Then data scientists started integrating the data access with the notebooks, which meant going where the information was stored: the database. Today, SQL is easy to integrate, and users are becoming comfortable using the notebooks to access the database and generate smart reports that integrate with data science (Julia or R) and machine learning tools. The newer Jupyter Lab interface is turning the classic notebook into a full-service IDE, complete with extensions that pull data directly from SQL databases.

7. Graph databases

The network of connections between people or things is one of the dominant data types on the internet, so it’s no surprise that databases are evolving to make it easier to store and analyze these relationships.

Neo4j now offers a visualization tool (Bloom) and a collection of data science functions for developing complex reports about the network. GraphDB is focusing on developing “semantic graphs” that use natural language to capture linguistic structures for big analytic projects. TerminusDB is aimed at creating knowledge graphs with a versioning system much like Git. All of them bring efficiency to storing a complex set of relationships that don’t fit neatly into standard tables.

8. Merging data storage with transport

Databases were once hidden repositories to keep data safe in the back office. Delivering this information to the user was the job of other code. Now, databases like Firebase treat the user’s phone or laptop as just another location for replicating data.

Databases like FaunaDB are baking replication into the stack, thus saving the DBA from moving the bits. Now, developers don’t need to think about getting information to the user. They can just read and write from the local data store and assume the database will handle the grubby details of marshaling the bytes across the network while keeping them consistent.

9. Data everywhere

A few years ago, all the major browsers began supporting the Local Storage and Indexed Storage APIs, making it easier for web applications to store significant amounts of data on the client’s machine. The early implementations limited the data to 5MB, but some have bumped the limits to 10MB. The response time is much faster, and it will also work even when the internet connection is down. The database is no longer running on just one box in your datacenter, but also on every client machine running your code.


Big Data – VentureBeat

Read More

Why database folks should care about User Research

November 5, 2020   BI News and Info

I attended a training session at Redgate this week by Chris Spalton. Chris’ session topic was “An Introduction into Planning User Research.” Chris pointed out that there’s much more to User Research, but that planning the research is particularly important as it is the foundation for everything you do later.

If you’re a database administrator or developer, you may not immediately feel that User Research is related to your job — that’s someone else’s job, right? 

While that used to be true, the world of tech is changing fast. There are reasons you should reconsider.

Understanding your customer is key

One of the most important aspects of Agile development methods, DevOps, and Digital Transformations is a focus on your customer. 

IT folks traditionally have thought of their colleagues as their internal “customers” — DBAs and others used to act more as a sort of internal service provider in a cost center. DevOps and Digital Transformations notably shift this and bring IT into a position to be creative and partner equally with others in the organization to generate value for external customers.

It’s easy to be a bit cynical about this and think, “That’s not the way it is in my job.” If you feel that way, I challenge you to take the initiative to change your own perspective! There is no need for database specialists to wait around for someone to give them permission to start thinking about their organization’s customers and to start being a creative collaborator with others.

Give yourself permission, and very likely it will have a positive impact on your career.

The User Research planning approach is suitable for all sorts of customer interactions

I found that a lot of what I learned about planning User Research is also useful to think about when framing all sorts of conversations:

  • Think carefully about how you define the problem space, as this impacts everything afterward
  • Remember that you are not your customer — and neither are your colleagues. In other words, try to clear your natural bias that you already understand your customer. You don’t!
  • Scope and prioritize your goals for learning carefully, but bring extra questions in case there is time to gain more insights
  • It is useful to gain insights from not only current customers — ex-customers, prospects, and others working in the space may hold valuable insights
  • You won’t cover everything about your research in one session. Plan your research over phases, from initial broad questions, progressing through interaction, showing proposals, and finally testing
  • Plan, prepare for, and structure your calls carefully in a way that respects your interviewee’s time, and allows them freedom to answer honestly and openly

After attending this session, I believe that learning more about User Research will also help me build my skills for other interactions and projects, too.

Participating in User Research connects you to your colleagues

As a DBA or database developer, you may not conduct your own user research. Quite possibly, you already have colleagues who conduct User Research. This may go by different names in different organizations, and the research may be conducted by people in software development, marketing, sales, or customer success teams. 

Keep an eye out to identify where user research is happening in your organization and ask your manager and teammates if they know how and when this works. Reach out to the teams who are doing this research and ask if you are able to join in sometimes, even as an observer, so that you can learn to better understand your customers. Taking this initiative can open many doors for you and help you bring your career into a new phase.


SQL – Simple Talk

Read More

database replication

October 7, 2020   BI News and Info

Database replication is the frequent electronic copying of data from a database in one computer or server to a database in another — so that all users share the same level of information. The result is a distributed database in which users can quickly access data relevant to their tasks without interfering with the work of others. Numerous elements contribute to the overall process of creating and managing database replication.

How database replication works

Database replication can either be a single occurrence or an ongoing process. It involves all data sources in an organization’s distributed infrastructure. The organization’s distributed management system is used to replicate and properly distribute the data amongst all the sources.


Overall, distributed database management systems (DDBMS) work to ensure that changes, additions and deletions performed on the data at any given location are automatically reflected in the data stored at all the other locations. DDBMS is essentially the name of the infrastructure that allows or carries out database replication — the system that manages the distributed database, which is the product of database replication.

The classic case of database replication involves one or more applications that connect a primary storage location with a secondary location that is often off site. Today, those primary and secondary storage locations are most often individual source databases — such as Oracle, MySQL, Microsoft SQL and MongoDB — as well as data warehouses that amalgamate data from these sources, offering storage and analytics services on larger quantities of data. Data warehouses are often hosted in the cloud.


Database replication copies data from a principal location.

Database replication techniques

There are several ways to replicate a database. Different techniques offer different advantages, as they vary in thoroughness, simplicity and speed. The ideal choice of technique depends on how companies store data and what purpose the replicated information will serve.

Regarding the timing of data transfer, there are two types of data replication:

  • Asynchronous replication is when the client sends data to the model server — the server that the replicas take data from. The model server then pings the client with confirmation that the data has been received, and from there it copies the data to the replicas at an unspecified or monitored pace.
  • Synchronous replication is when data is copied from the client server to the model server and then replicated to all the replica servers before the client is notified that data has been replicated. This takes longer to verify than the asynchronous method, but it presents the advantage of knowing that all data was copied before proceeding.

Asynchronous database replication offers flexibility and ease of use, as replications happen in the background. However, there is a greater risk that data will be lost without the client’s knowledge because confirmation comes before the main replication process. Synchronous replication is more rigid and time-consuming, but more likely to ensure that data will be successfully replicated. The client will be alerted if it hasn’t, since confirmation comes after the entire process has finished.
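To make the distinction concrete, here is a hedged example using SQL Server Always On availability group syntax, where the same trade-off is expressed as a commit mode; the availability group and replica names are hypothetical.

-- Synchronous commit: the primary waits for the replica to harden the log
-- before confirming the transaction to the client (safer, higher write latency)
ALTER AVAILABILITY GROUP [SalesAG]
MODIFY REPLICA ON N'REPLICA02'
WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT);

-- Asynchronous commit: the primary confirms immediately and ships the log in the
-- background (faster writes, but recent transactions can be lost on failover)
ALTER AVAILABILITY GROUP [SalesAG]
MODIFY REPLICA ON N'REPLICA02'
WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);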

There are also several types of database replication based on the type of server architecture. The term leader will be used in these types to mean the same thing as model in the previous asynchronous vs. synchronous examples:

  • Single-leader architecture has one server that receives writes from clients, while the replicas draw data from it. This is the most common and classic method. It’s a synchronized method, but somewhat inflexible.
  • Multi-leader architecture has multiple servers that can receive writes and serve as a model for replicas. It is beneficial when replicas are spread out and a leader must be near each of them to prevent latency.
  • No-leader architecture lets every server receive writes and serve as a model for replicas. This approach was pioneered by Amazon’s DynamoDB. While it offers maximum flexibility, it poses challenges for synchronization.

Advantages and disadvantages

Database replication is often overseen by a database or replication administrator. A properly implemented replication system can offer several advantages, including the following:

  • Load reduction. Because replicated data can be spread over several servers, it reduces the likelihood that any one server will be overwhelmed with user queries for data.
  • Efficiency. Because each server handles queries from fewer users, it can offer improved performance.
  • High availability. Employing multiple servers with the same data ensures high availability, meaning that if one server goes down, the entire system can still provide acceptable performance.

Many disadvantages of database replication stem from poor general data governance practices. These disadvantages include the following:

  • Data loss. Data loss can occur during replication when incorrect data or iterations or updates of a database are copied and, consequently, important data is deleted or unaccounted for. This can happen if the primary key used to verify the quality of data in the replica is malfunctioning or incorrect. It can also occur if database objects are incorrectly configured within the source database.
  • Data inconsistency. Similarly, incorrect or out-of-date replicas can cause different sources to be out of sync with each other. This may lead to wasted data warehouse costs that are spent needlessly analyzing and storing irrelevant data.
  • Multiple servers. Running multiple servers has an inherent maintenance and energy cost associated. It requires either the organization or a third party to address these costs. If a third party handles them, the organization runs the risk of vendor lock-in or service issues beyond the organization’s control.

Evolution of database replication

Early instances of database replication were typically described as master-slave configurations, but comparable descriptions today tend to incorporate terminology such as master-replica, leader-follower, primary-secondary and server-client.

Replication techniques centered on relational database management systems have expanded with the advent of the virtual machine and distributed cloud computing, to include nonrelational database types. Again, replication methods vary among such nonrelational databases as Redis, MongoDB and the like.

While remote office database replication may have been the canonical example of replication for many years, fail-safe and fault-tolerant database backup schemes have also arisen as drivers of replication activity — as have horizontally scaling distributed database configurations, both on premises and on cloud computing platforms. Replication details vary between such relational systems as IBM Db2, Microsoft SQL Server, Sybase, MySQL and PostgreSQL.

In all cases, data replication design becomes a balancing act between system performance and data consistency. Database replication can be done in at least three different ways. In snapshot replication, data on one server is simply copied to another server or to another database on the same server. In merging replication, data from two or more databases is combined into a single database. And, in transactional replication, user systems receive full initial copies of the database and then receive periodic updates as data changes.
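In SQL Server, for example, these modes are configured through the replication stored procedures. The following is only a rough outline of setting up a snapshot publication on the publisher, with hypothetical database, publication and article names; it assumes a distributor is already configured and leaves most optional parameters at their defaults.

-- Enable the database for publishing (hypothetical database name)
EXEC sp_replicationdboption @dbname = N'Sales', @optname = N'publish', @value = N'true';

-- Create a snapshot publication: subscribers periodically receive a full copy
EXEC sp_addpublication @publication = N'SalesSnapshot', @repl_freq = N'snapshot';

-- Add a table (article) to the publication
EXEC sp_addarticle @publication = N'SalesSnapshot', @article = N'Orders', @source_object = N'Orders';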

Database replication vs. mirroring

While data mirroring is sometimes positioned as an alternative approach to data replication, it is actually a form of data replication. In relational database mirroring, complete backups of databases are maintained for use in the case that the primary database fails. Mirrors, in effect, serve as hot standby databases. Data mirroring has found considerable use within the Microsoft SQL Server community.

With database replication, the focus is usually on database scale out for queries — requests for data. Database mirroring, in which log extracts form the basis for incremental database updates from the principal server, is typically implemented to provide hot standby or disaster recovery capabilities. Simply put, mirroring focuses on backing up what’s there, and replication focuses on improving operational efficiency as a whole — which involves maintaining secure data backups using mirroring.

Database replication tools

Companies can either use the database replication tools offered by their database software provider or invest in third-party replication tools to execute and manage database replication processes. The latter option allows flexibility: Third-party tools are typically vendor-agnostic and can be used to create data replicas across multiple types of databases in an organization.

Database replication software

Third-party database replication tools that work with various databases include the following:

  • Qlik Replicate. A software package that focuses on being easy to learn and implement, Replicate uses automation and log-based capture to minimize IT operations workload. This involves capturing streams of continuous data, which is ideal for companies that need ways to process big data efficiently.
  • Informatica Data Replication. Informatica can target a wide range of database and data warehouse appliances. It offers the Data Engineering product series for streaming, integration, quality and masking enterprise data. Its website features a how-to library and list of guides to help customers.
  • Talend Open Studio for Data Integration. One of the most prominent open source data integration products, it includes a large repertoire of resources for those just getting started with the software. Talend offers tutorials, demos and blog posts on topics like the use of metadata and best practices for data model design. There is also a community of experienced users that new users can reach out to for tips on using Talend’s data integration solution.
  • Quest SharePlex. This offering focuses mainly on Oracle database replication and provides both on-premises and cloud options. SharePlex promises high availability, 24/7 customer support and a simple user experience that allows for quick replication and scaling.

Examples of database vendor replication tools include the following:

  • Microsoft’s SQL integration features. Including SQL Server Integration Services (SSIS) for Azure, these tools specialize in cleaning, aggregating, merging and copying data, as well as extracting and loading data.
  • Oracle GoldenGate. This tool offers log-based capture for Oracle databases. It promises simple configuration, extreme performance and comprehensive security. It features a visual management and monitoring feature called the Management Pack as well.
  • IBM’s Db2 SQL replication tool. This tool has two main replication options: Q and SQL. It is one of IBM’s most popular replication tools. It is useful for distributing source data to multiple targets, but may not be ideal for all replication scenarios because it has high latency.


SearchSQLServer

Read More

JSonar raises $50 million for AI-powered database monitoring products

June 9, 2020   Big Data

Database monitoring platform startup JSonar today announced it raised $50 million, the bulk of which it plans to put toward R&D and go-to-market efforts. Notably, it’s the company’s first-ever round of institutional funding, and it comes as JSonar’s customer base grows to include 5 of the world’s 10 largest banks.

Database monitoring is a critical part of most apps’ maintenance. Finding issues in time can help apps remain healthy and accessible, not to mention performant. But the adoption of cloud-based and dynamic, modern data systems makes this challenging; legacy data activity monitoring and logging solutions sometimes fail to transform data into actionable insights.

JSonar claims its SonarW and SonarG analytics solutions perform better because they’re automated and AI-powered. The models underpinning them transform petabytes of raw database activity data into security recommendations, with capabilities that go beyond detection and deliver preventative controls. They helpfully funnel these controls and recommendations into existing workflows, and into third-party DevOps services and platforms via prebuilt integrations.

Using SonarW and SonarG, the latter of which is designed for IBM Guardium, users can develop custom analytical algorithms atop large-scale databases and automate reporting and governance processes. According to JSonar, this saves companies months they’d otherwise spend developing proprietary platforms.


Beyond SonarW, JSonar offers Data-Centric Audit and Protection (DCAP) Central, a product that taps the company’s SonarC2 analytics engine to create data lakes capturing, retaining, and facilitating database management. Then there’s DBSec2.0, a fully managed SonarG offering that ingests database audit data and lets customers interact with it directly for security and compliance insights.

The JSonar suite supports virtually any database system on any cloud, including infrastructure-as-a-service, platform-as-a-service, and database-as-a-service setups. This includes out-of-the-box support for Amazon Web Services, Google Cloud Platform (GCP), Microsoft Azure, Snowflake, MongoDB, Cassandra, Hadoop, Teradata, and more than 60 others.

The database security and monitoring market is anticipated to be worth $7.01 billion by 2022, Markets and Markets reports, and JSonar is far from the only startup attempting to corner it. Among others, there’s SolarWinds, which has been offering management and optimization tools for open systems and databases for years, and Netdata, which makes an open source database monitoring tool.

But JSonar founder and CTO Ron Bennatan, former cofounder of Guardium Database Activity (which IBM acquired in November 2009), believes his company’s reliance on automation sets it apart from the rest. “The rapidly shifting enterprise landscape, including cloud adoption, an explosion of database platforms, the pressing need for data security beyond only compliance, and years of frustration over runaway costs, has created a huge opportunity for us to rapidly expand,” he told VentureBeat via email. “Traditional database security solutions have proven to be too costly to be used broadly and provide little beyond a checkbox; the modern data landscape demands a new approach.”

Goldman Sachs led the investment in JSonar. As a part of the transaction, Goldman Sachs managing director David Campbell will join the startup’s board of directors.



Big Data – VentureBeat

Read More