Benefits of Cloud-Based Database Applications

I still remember the week I realized that the cloud would most likely live up to its hype. It was a couple of years ago at the PASS Summit, and Microsoft had shifted their marketing message from being all-in on the cloud to a more pragmatic hybrid approach. In many of my offline interactions that week, I heard story after story of successes in moving some workloads to the cloud, resulting in significant cost savings and greater flexibility. That was the week that turned this skeptic into a believer that the cloud is the real deal, and it’s here to stay.

Although not every application is a perfect fit for cloud deployment, the fact is that cloud-based services are rapidly becoming the standard for new development. Platform as a Service (PaaS) vendors are also making it very easy to move existing on-premises applications to the cloud through myriad tools for data and metadata migration.

Benefits of Cloud-Based Database Applications

As the PaaS offerings from vendors such as Amazon and Microsoft have matured over the last few years, the benefits of moving workloads to the cloud have become clearer.

Decreased cost. Let’s say you decide to run your own standalone Microsoft SQL Server machine. Assuming you’re buying modest hardware, you’re going to be out $5,000 for the box, not to mention the peripheral equipment (network gear, UPS units, etc.) required to set up a server and keep it connected and running. Then there’s SQL Server licensing, which for Standard Edition alone was around $4,000 per core as of this writing. Assuming your modest server is a quad-core machine, you’re at $16,000 just for SQL Server licensing. If you plan to have redundant hardware for high availability or disaster recovery, expect to double (or more) these fixed costs. Also, keep in mind that servers don’t live forever; plan to repeat this purchase every 3-5 years. In contrast, the recurring cost model of PaaS offerings replaces that steep capital expenditure with pay-as-you-go pricing, often resulting in a lower total cost.

Time to market. Think about the things that will slow down your development or deployment timeline. It takes time – sometimes weeks or months – just to get approval for a capital expenditure for the hardware and software. After that, the new machine has to be built by the manufacturer, delivered, and set up in your data center. Install the operating system, configure it on your network, and then you can finally install SQL Server. In the absolute best case, this is a weeks-long process; in reality, going from zero to on-premises SQL Server will take a month, perhaps much longer. Building a PaaS database can be completed in minutes rather than weeks or months; in fact, earlier today I built a SQL Database in Azure in about the same length of time it took me to type out this paragraph. If time is a factor (and really, when is it not?), the speed at which a new cloud database can be created is a very compelling argument for using a PaaS architecture.
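To illustrate how little friction is involved, here is a minimal T-SQL sketch of provisioning an Azure SQL Database. It assumes a logical server already exists and that the statement is run against that server’s master database; the database name and service tier are placeholders, not recommendations.

    -- Minimal sketch: create an Azure SQL Database from T-SQL.
    -- Assumes an existing logical server; run against its master database.
    -- The name and Standard/S1 tier below are illustrative placeholders.
    CREATE DATABASE SalesReporting
    (
        EDITION = 'Standard',        -- service tier
        SERVICE_OBJECTIVE = 'S1'     -- performance level within that tier
    );

The Azure portal, PowerShell, and the Azure CLI offer the same capability; whichever route you take, the new database is typically online and accepting connections within minutes.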

Managed services. If you’re hosting your own database server for a 24×7 application, your organization is on the hook for handling things like operating system hiccups, hardware failures, network snafus, and power outages. Keeping resources and staff to handle these issues takes time, training, and most importantly, money. Add to all of that the ongoing environmental costs of running a data center: cooling, power, backup power apparatus, and physical security, among others. By using a cloud platform to develop and run your database applications, you’ll shift the responsibility of data center management over to your cloud vendor, saving a great deal of effort and money. You’ll also rest a bit easier, since software updates and service SLAs will also be handled by the PaaS host.

Easy scalability. With your on-premises database server, what happens in six months when you discover that the hardware can no longer keep up with demand? You could upgrade the server hardware (assuming it’s got headroom for that) or just buy a new larger server. Similarly, I’ve seen a few cases of buyer’s remorse where the needs were overstated and the server hardware was much more robust (and therefore, more expensive) than it needed to be. Overbuying or underbuying hardware is always costly. Conversely, when building a database application in the cloud, it is relatively easy to set – and later change, if needed – the resources allocated to it. As the processing and storage needs change, the resources can be scaled up or down in seconds with just a few mouse clicks.
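Continuing the hypothetical example above, resizing an Azure SQL Database is a single statement; the service objective shown is only an illustration, and scaling back down works the same way.

    -- Sketch: scale the hypothetical database up to a larger performance level.
    -- Scaling down later uses the same statement with a smaller objective.
    ALTER DATABASE SalesReporting
        MODIFY ( SERVICE_OBJECTIVE = 'S3' );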

Conclusion

Platform as a Service offerings have evolved and improved significantly in the past couple of years. These cloud services make a strong case for cost savings, faster delivery, and greater flexibility for both new development and migrated applications. As the volume of customers migrating to the cloud continues to increase, prices will continue to fall, furthering the cost advantage of cloud-based applications over their on-premises counterparts.

When Data Warehouse Projects Fail

Getting a data warehouse over the finish line is hard. Data warehouses are complex organisms, requiring intense collaboration and technical expertise across every level of the organization. When it all comes together, it’s a beautiful thing. However, many data warehouse initiatives never make it to user acceptance.

On my technical blog, I have cataloged some of the reasons I’ve found that data warehouses fail. Avoiding these pitfalls can reduce the possibility of a data warehouse project going off the tracks. But things can still go wrong even with the best planning and adherence to best practices.

When Data Warehouse Projects Fail

The odds are good that something will go wrong during every data warehouse implementation: the due date for a deliverable gets pushed out, a dimension table has to be refactored, the granularity of a fact table changes. If things slide to the point that the project is no longer moving forward, it is critical to respond properly and, if possible, get the project moving again.

Triage

First and foremost, focus on determining status and next steps. Is the project truly ended, or has it just stalled? That distinction will drive most of the remaining decisions. The road forward will look very different for a design or development impasse than for a project shelved due to budget cutbacks or other factors. Assess whether there is working room to salvage the project. If so, use that time to isolate and minimize the speed bumps that slowed the project the first time.

When triaging, don’t be afraid to issue an “all-stop” directive while you reassess next steps. However, don’t let the project founder in that state for long. Figure out what went wrong, fix it, and move forward.

Take inventory

Regardless of whether or not the project is salvageable, take stock of the individual deliverables and the respective status of each. If the project has simply stalled but the plug has not been pulled, having a clearly identified status of the technical and nontechnical assets will make the process of restarting the project far easier. If the project is not salvageable, there is almost certainly some business value in the work that has already been completed. Properly classifying and archiving those assets can provide a jump start for related initiatives in the future.

Communicate, communicate, communicate

Whether it’s a stalled project or one that has been stopped entirely, timely communication is essential to managing expectations. Clearly communicate the status of the project, what to expect next, and any timelines. Make sure that everyone involved – business analysts, executives, technical staff, and other stakeholders – is clear on the status and timeline. Don’t cast blame here; keep the updates fact-based and simple.

Renew the focus as a business project

For stalled data warehouse projects, it is important to refresh the focus of the project. Data warehouses should always be driven by business teams, not technical teams. Because of the technical complexity of data warehouse projects, it is common to lose focus and run them as purely technical initiatives. Although technical architecture is critical, the business stakeholders should be the ones driving the design and deliverables.

Scale back on deliverables

Among the reasons I’ve found that data warehouse projects fail, trying to do too much in a single iteration is a common factor. Big-bang data warehouse projects don’t leave much flexibility for design changes or refactoring after user acceptance. If a stalled or failed data warehouse has many concurrent development initiatives, consider cutting back on the initial set of deliverables and deploying in phases. This can add time to the overall schedule, but it yields a better product.

Bring in a hired gun

Insourcing your data warehouse project is often the right solution: you aren’t spending money on external help, you don’t lose that project-specific knowledge when the project is done, and your team gains the experience of building out the technical and nontechnical assets of the project. However, if a data warehouse project has stalled, bringing in the right partner to get back on track can help to save the project, and save time and money in the long run.

Conclusion

Like any technical project, data warehouse initiatives can stall or even fail. If this happens, it is important to properly set the project back on track, or wind it down as gracefully as possible if the project has been abandoned.

Upcoming Webinars on SQL Server Reporting Services 2016

With last year’s release of SQL Server, those of us who trade in the reporting and analytics space got a big gift: a brand new version of SQL Server Reporting Services. Unlike prior versions, SSRS 2016 brought many much-needed improvements that immediately increased the usability and relevance of Reporting Services as a reporting and analytics tool. For the first time in many years, SQL Server Reporting Services is exciting again!

For the next three weeks, we will be hosting weekly webinars to demonstrate the new features of Reporting Services 2016. Please join us for the following webinars:

Introduction to the Report Portal in SSRS 2016 (Tuesday, January 24th)

Introduction to KPIs in SSRS 2016 (Tuesday, January 31st)

Introduction to Mobile Reports in SSRS 2016 (Tuesday, February 7th)

We look forward to seeing you there!

Keeping Data Quality Simple

A universal truth in business is that bad data costs money, and can even be dangerous. Data quality processes are often an afterthought rather than a central component of the architecture, due in part to fear of the complexity of checking and cleansing data. In many cases, that fear is warranted; the longer data quality work is delayed (in time as well as in data lineage), the more time and money it costs to make it right.

Some data quality needs require a formal project and specialized tools to address. Other issues can be mitigated or avoided entirely with small changes to the data model or load processes. We like to look for these “low-hanging fruit” opportunities when designing a data or ETL infrastructure, because some of these simple modifications can save dozens of hours (or more) and significantly improve data quality very quickly.

Keeping Data Quality Simple

Among the passive changes that can help prevent or fix data quality issues:

Use of proper data types. Many of the data quality issues we fix involve numerical data stored as text, into which non-numerical values are inadvertently loaded. Even harder to detect is numerical data stored with an incorrect precision. The same goes for date values, date and time values, and geolocation data, among others. Storing data in the type that most closely represents its real use will avoid a lot of downstream problems that are often hard to diagnose and expensive to fix.
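As a simple illustration (the table and column names here are hypothetical), compare a loosely typed table with one whose types match the data’s real use:

    -- Hypothetical example: risky design, since any text can land in these columns.
    CREATE TABLE dbo.SensorReading_Loose
    (
        ReadingValue VARCHAR(50),    -- numeric data stored as text
        ReadingDate  VARCHAR(50)     -- date stored as text
    );

    -- Safer design: the data type itself rejects bad values before they are stored.
    CREATE TABLE dbo.SensorReading_Typed
    (
        ReadingValue DECIMAL(10,2) NOT NULL,
        ReadingDate  DATE          NOT NULL
    );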

Non-nullable fields. Every major RDBMS platform supports non-nullable fields, which require a value in said field before an insert or update operation will complete. If a particular field must have a value before that record can be considered valid, marking the column as non-nullable can avoid data consistency issues where that value is missing.
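A minimal sketch of the idea, using hypothetical table and column names:

    -- Hypothetical example: a customer record is not valid without a name,
    -- so that column is non-nullable; the email address remains optional.
    CREATE TABLE dbo.Customer
    (
        CustomerID   INT          NOT NULL PRIMARY KEY,
        CustomerName VARCHAR(100) NOT NULL,
        EmailAddress VARCHAR(255) NULL
    );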

Foreign key validation. The use of foreign keys for data validation is a best practice in most any relational database architecture, and that is doubly true when improving data quality is a main objective. Using foreign keys to limit values to only those entries explicitly allowed in the table referenced by the foreign key prevents stray values from sneaking into a constrained field.
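A brief sketch, again with hypothetical names, of constraining a column to a known list of values:

    -- Hypothetical example: only status codes present in the lookup table
    -- can be written to the SalesOrder table.
    CREATE TABLE dbo.OrderStatus
    (
        StatusCode        CHAR(3)     NOT NULL PRIMARY KEY,
        StatusDescription VARCHAR(50) NOT NULL
    );

    CREATE TABLE dbo.SalesOrder
    (
        OrderID    INT     NOT NULL PRIMARY KEY,
        StatusCode CHAR(3) NOT NULL
            CONSTRAINT FK_SalesOrder_OrderStatus
            REFERENCES dbo.OrderStatus (StatusCode)
    );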

Check constraints. Preventing the insertion of values outside a defined range can be accomplished through check constraints, found in every major database platform. Like foreign keys, check constraints limit the values that can be entered in a column, but without a separate lookup table; they also give you the flexibility to define a range of allowable values rather than a discrete list. An example would be using a check constraint to enforce that all dates of birth are on or after a certain date.
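Here is a sketch of that date-of-birth example; the table name and cutoff date are assumptions for illustration only.

    -- Hypothetical example: reject any date of birth before January 1, 1900.
    ALTER TABLE dbo.Patient
        ADD CONSTRAINT CK_Patient_DateOfBirth
        CHECK (DateOfBirth >= '1900-01-01');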

ETL cleanup. Most ETL tools have built-in functionality allowing for lightweight data cleansing. Assuming the data in question is being processed through a structured ETL tool, adding logic to correct minor data quality issues is relatively easy to do with low risk. Emphasize a light touch here – you won’t want to attempt address standardization or name deduplication without a formal, codified process.
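A minimal example of that light touch, assuming a hypothetical staging-to-warehouse load written in T-SQL (all names are illustrative):

    -- Hypothetical example: light-touch cleanup during the load, trimming stray
    -- whitespace and normalizing the case of a code column. Heavier cleansing
    -- (address standardization, deduplication) belongs in a formal process.
    INSERT INTO dbo.DimCustomer (CustomerName, StateCode)
    SELECT
        LTRIM(RTRIM(CustomerName)),
        UPPER(LTRIM(RTRIM(StateCode)))
    FROM staging.Customer
    WHERE CustomerName IS NOT NULL;    -- skip rows missing a required value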

No Substitute for Formal Data Quality Processes

Even when taking these precautions to prevent or correct some issues, you’ll still run into cases where a more rigorous and comprehensive data quality initiative is needed. The suggestions above will not eliminate the need for proper data quality tooling, but they can help reduce the pain of, or delay the need for, in-depth data quality remediation.

Conclusion

Data quality requires a multifaceted strategy. Taking care of some of the simple problems with easy-to-use tools already at your fingertips can have a significant and immediate impact on the quality of your data.

Data Warehousing: It’s About The Business

Data warehouses are complex creations. The ETL and data cleansing processes that sanitize and reshape the data, the relational database in which the data warehouse resides, the auditing routines verifying that the data is correct, and the reporting and analytics tools that sit atop the entire structure all come together to make an intricate but immensely valuable business asset. In fact, most of the effort and time on the project schedule is focused on the technical components of a data warehouse solution.

As with any development project, careful attention must be paid to using best practices in putting together the bits for the solution. However, with data warehouse projects, too often the focus becomes the design and behavior of the technology. Focusing just on the technical aspects of a data warehouse solution is effective for heads-down coding of specific pieces, but does not work for managing the project as a whole.

When building a data warehouse, there is one critical point that must always be kept front of mind:

A data warehouse is a business initiative, not a technical one.

I’ve seen data warehouse initiatives go off track when the focus shifted away from the business needs. Through every aspect of the project – initial brainstorming, technical design, testing, validation, delivery, and support – the needs of the business must drive every decision made. While the data warehouse will be built using memory, disks, code, tables, and ETL processes, the primary goal of the project must remain clear: The data warehouse exists to answer business questions about the data. Anything contradictory to that is a barrier and must be removed.

When architecting a data warehouse solution, build it using the best technical design possible. But in all design decisions, remember the ultimate goal and audience of the final product. Data warehousing is about the business, not the technology.

Partnership with Straight Path Solutions

I am happy to announce that Tyleris Data Solutions is partnering with Straight Path Solutions to assist our clients with data needs that extend beyond business intelligence and data warehousing. Straight Path will be our go-to resource for solutions that require deep expertise in database tuning and high availability in addition to data warehousing, ETL, and reporting needs that Tyleris already services.

Data challenges are rarely singularly faceted; very often solutions are needed across multiple architecture realms. Because Tyleris has always strived to maintain a deep-and-narrow focus on data warehousing and closely related topics, we have chosen to partner with Straight Path to help us help our clients who need unmatched expertise in SQL Server database engine management. This partnership enables Tyleris to help with a broader set of problems without sacrificing the depth of knowledge or focus of our core offerings.

Straight Path is the right partner for us for a couple of different reasons. Personally, I have worked with Mike Walsh at Straight Path for years, and his integrity, workmanship, and attention to client needs have always greatly impressed me. Like Tyleris, Straight Path doesn’t try to be a “do-everything” data provider, instead focusing on those areas where they have deep expertise and can add a tremendous amount of business value.

Mike also posted an announcement of this new partnership on the Straight Path blog earlier today sharing some of his perspective on this. I am very excited about this partnership with Straight Path Solutions, and I look forward to working with Mike and the team.

DevConnections 2016

I am excited to share that I will be presenting at the DevConnections conference in Las Vegas this October. I will present one full-day workshop and two regular sessions:

Building Better SSIS Packages  – Full-day workshop (Monday, October 10)

Making the Most of the SSIS Catalog (Tuesday, October 11)

Change Detection in SQL Server (Thursday, October 13)

I’ll be spending most of the week in the city, so if you’re attending let me know! I’d be happy to meet up and chat.

Data Warehouse: On-Premises or Cloud?

I’ve been fielding this question a lot these days: “We’re building a data warehouse – should we build it here or in the cloud?” It’s a fair question, but it’s not the question that should be asked. The more appropriate question is this: “What part of our data warehouse solution should be in the cloud, and how does it work together with our on-premises data?”

I shared a few of my thoughts on this topic a few weeks ago in a podcast interview with Carlos Chacon, when we discussed whether or not the on-premises data warehouse was dead. Without spoiling all of the details of that conversation, my short answer is that the on-premises data warehouse is alive and well but is no longer the only DW option.

As recently as three years ago, the cloud was still relatively new and not yet widely in use in most organizations. At the same time, companies selling cloud services were in the midst of a massive marketing effort to direct customers to the cloud. Microsoft famously declared themselves to be all-in on cloud well before the market was ready to follow. Many IT leaders and technologists bristled at the thought of being forced into the cloud at the expense of tried-and-true on-premises solutions.

However, in the past couple of years the message from cloud providers has softened. No longer is it “cloud or bust”. Rather, cloud services companies – and Microsoft in particular – have reshaped the message to one in which the cloud is just one piece of a heterogeneous architecture that may include on-prem, PaaS, IaaS, and SaaS solutions. At the same time, consumers are realizing the value of cloud-based solutions for some of their architecture. Although I rarely have a client that wants to build an all-cloud infrastructure, most everyone I work with is at least exploring, if not actively using, cloud services for a portion of their data systems.

Cloud services are here to stay. No, the cloud absolutely will not take over on-premises data storage and processing. Rather, cloud offerings will be one more option for managing data and the code around it. So the question is not whether you should be in the cloud – the answer is yes (or it soon will be). The more practical question is how to best leverage cloud services as part of a hybrid strategy to minimize development time and total cost of ownership.

This post originally appeared in the Data Geek Newsletter.

Introducing the Pinch Hit Service

I am happy to announce the launch of a new service designed to help with very short-term consultation needs. Although most consulting engagements are weeks or months in duration, we’ve discovered that some client needs are simple and do not require a traditional consulting approach. In response, Tyleris has created the Pinch Hit service as a simple, no-commitment, 2-hour remote consultation.

The Pinch Hit was created to assist clients who are handling their own data warehousing, ETL, and reporting infrastructure. They may be looking for a second set of eyes, assistance with troubleshooting a specific problem, or a focused training session. Much like the use of a pinch hitter in baseball, Tyleris brings a specialized skill set to help deal with a clutch situation.

Not every business or technical need is suitable for this service, but in cases where the problem domain is narrow, the Pinch Hit can deliver outstanding value in a short time. If you find yourself in need of a Pinch Hit engagement from Tyleris, just let us know how we can help.

Request a Pinch Hit

SSIS Training for Spring 2016

If you are looking for SSIS classroom training, we’ve got several exciting opportunities for you this spring and summer! Tim Mitchell will be delivering full-day introductory and intermediate-level presentations in Baltimore, Orlando, and Baton Rouge later this year.

Saturday, April 16th in Orlando, FL: Introduction to SSIS

Friday, April 22nd in Orlando, FL: Building Better SSIS Packages

Friday, April 29th in Baltimore, MD: Building Better SSIS Packages

Friday, August 5th in Baton Rouge, LA: Building Better SSIS Packages

Each of these events is open now for registration; you can participate by following any of the links above. As always, contact us with any questions.