Data migration is not new to ISVs. Converting data from legacy systems is common in new implementations, as is migrating data during major upgrades. For this important step of a customer implementation project, ISVs rely on manual methods, proprietary tools, or sophisticated ETL tools. However, things change drastically when a customer moves to the cloud and the on-premise data has to move with it. ISVs need to know about and prepare for the following in the data migration solutions they build for customers moving from on-premise to the cloud.
Storage selection: This decision comes well before the data migration itself and should be made while designing the product's cloud architecture. Cloud vendors offer multiple storage options; Amazon, for example, offers S3, SimpleDB and RDS. RDS is good for web apps that need a relational database, S3 is suitable for large media files, and SimpleDB fits lightweight, queryable attribute data. Selecting the right storage simplifies data migration later, during customer onboarding.
Data location: On-premise data migrations let you specify the target storage and path. In the cloud, however, you have no control over where your data resides unless you ask for it. When selecting a cloud vendor, it is therefore important to check whether the vendor allows and facilitates this choice. Customers may have compliance reasons to insist that data stay within a region, and not having a strategy in place could mean loss of business.
Data selection: It is critical to plan which data will be migrated. There can be several reasons for keeping part of the data on premise: data sensitivity, archived data, or a hybrid cloud model. What really matters is the data extraction strategy and tools that can apply filters (for example attribute-based filters, selection tables, or SQL aggregate functions) to extract the appropriate data.
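The filter types mentioned above can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module as a stand-in for the on-premise database; the orders and migrate_customers tables, their columns, and the function names are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory source database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL,
            archived INTEGER,
            sensitive INTEGER
        );
        CREATE TABLE migrate_customers (customer_id INTEGER PRIMARY KEY);
        INSERT INTO orders VALUES
            (1, 10, 50.0, 0, 0),
            (2, 10, 75.0, 1, 0),
            (3, 20, 20.0, 0, 0),
            (4, 30, 99.0, 0, 1),
            (5, 30, 10.0, 0, 0);
        INSERT INTO migrate_customers VALUES (10), (30);
    """)
    return conn


def select_rows(conn):
    """Attribute-based and selection-table-based filtering combined."""
    query = """
        SELECT o.id, o.customer_id, o.amount
        FROM orders AS o
        JOIN migrate_customers AS m
          ON m.customer_id = o.customer_id   -- selection table
        WHERE o.archived = 0                 -- attribute filters
          AND o.sensitive = 0
        ORDER BY o.id
    """
    return conn.execute(query).fetchall()


def customers_over_total(conn, min_total):
    """Aggregate-based filtering: customers whose live orders reach min_total."""
    query = """
        SELECT customer_id, SUM(amount)
        FROM orders
        WHERE archived = 0
        GROUP BY customer_id
        HAVING SUM(amount) >= ?
        ORDER BY customer_id
    """
    return conn.execute(query, (min_total,)).fetchall()
```

With the demo data, select_rows keeps only the non-archived, non-sensitive orders of customers listed in the selection table, while archived and sensitive rows stay on premise.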
Multi-tenant data transformation: The degree of multi-tenancy plays an important role in the data migration strategy. If each tenant has a separate database, migration is simple. If tenants share a database but have separate schemas, most ETL tools support it. The challenge comes with the shared-schema model, where a tenant ID has to be passed at the record level. Run-of-the-mill ETL tools will not support this, so the ISV needs to prepare a solution well in advance of moving to shared-schema multi-tenancy.
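A minimal sketch of the record-level tenant ID step for the shared-schema case follows. It assumes each tenant's extracted rows arrive as plain dictionaries with a local id field; the tenant_id column name and both function names are hypothetical, and a real solution would also need to handle foreign keys and bulk loading.

```python
from typing import Dict, Iterable, Iterator, List


def stamp_tenant(rows: Iterable[dict], tenant_id: str) -> Iterator[dict]:
    """Yield copies of the rows with the tenant ID added at record level."""
    for row in rows:
        existing = row.get("tenant_id")
        if existing is not None and existing != tenant_id:
            raise ValueError(f"row {row!r} already belongs to {existing!r}")
        stamped = dict(row)  # copy, so the source extract is left untouched
        stamped["tenant_id"] = tenant_id
        yield stamped


def merge_tenants(sources: Dict[str, List[dict]]) -> List[dict]:
    """Merge per-tenant extracts into one shared-schema row set.

    Local ids may repeat across tenants, so uniqueness is checked on the
    composite key (tenant_id, id) rather than on id alone.
    """
    merged: List[dict] = []
    seen = set()
    for tenant_id, rows in sources.items():
        for row in stamp_tenant(rows, tenant_id):
            key = (row["tenant_id"], row["id"])
            if key in seen:
                raise ValueError(f"duplicate key {key}")
            seen.add(key)
            merged.append(row)
    return merged
```

Two tenants that both use id 1 can then coexist in the shared table, because every migrated row carries its own tenant ID.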
Data loading strategies: There are multiple options, such as direct loading, simple APIs, or workflow-based APIs. Simple APIs check for data model failures or non-compliance, but in a multi-tenant cloud there may be tenant-specific data model extensions, and the APIs need to work with these. Similarly, workflow-based APIs run predetermined workflows and raise predefined errors when the data deviates. These APIs need to be developed beforehand as part of the migration strategy.
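To make the point about tenant-specific extensions concrete, here is one possible shape for a simple loading API. It is a sketch under assumptions: the base schema, the acme tenant with its loyalty_tier extension, and the in-memory list standing in for cloud storage are all invented for illustration.

```python
# Hypothetical shared data model: required fields and their types.
BASE_SCHEMA = {"id": int, "name": str}

# Hypothetical per-tenant extensions: optional extra fields.
TENANT_EXTENSIONS = {"acme": {"loyalty_tier": str}}


def validate_record(record, tenant_id):
    """Return a list of data model errors for one record (empty if valid)."""
    extensions = TENANT_EXTENSIONS.get(tenant_id, {})
    errors = []
    for field, expected in BASE_SCHEMA.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    for field, value in record.items():
        if field in BASE_SCHEMA:
            continue
        if field not in extensions:
            errors.append(f"unknown field: {field}")
        elif not isinstance(value, extensions[field]):
            expected = extensions[field]
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors


def load_records(records, tenant_id, store):
    """Append valid records to store; return rejected (record, errors) pairs."""
    rejected = []
    for record in records:
        errors = validate_record(record, tenant_id)
        if errors:
            rejected.append((record, errors))
        else:
            store.append({**record, "tenant_id": tenant_id})
    return rejected
```

The same record can be valid for one tenant and rejected for another, which is why the check is driven by the tenant's own extension set and not by the base model alone.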
Migration verification: The standard method of verifying an on-premise migration is to check that all attributes of the selected objects were migrated. This poses challenges in the cloud. Unlike on-premise relational targets, cloud storage does not yet support strong SQL querying capabilities, so the standard top-down and bottom-up equivalence or bottom-up fingerprint techniques do not work as is. Custom verification solutions therefore need to be built beforehand. While this post does not prescribe specific solutions, ISVs would benefit from using these points as a checklist in their cloud migration strategy.
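As one illustration of what a custom verification step might look like when the target cannot be queried with SQL, the sketch below compares per-record fingerprints computed on both sides. It assumes records can be read back from source and target as key-to-dictionary mappings; the function names and report layout are hypothetical, and this is not a substitute for a full verification tool.

```python
import hashlib
import json


def fingerprint(record):
    """Hash a record in a canonical form so attribute order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_stores(source, target):
    """Compare two key -> record mappings by fingerprint.

    Returns the keys missing from the target, the keys present only in the
    target, and the keys whose attributes differ between the two sides.
    """
    missing = sorted(k for k in source if k not in target)
    unexpected = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k for k in source
        if k in target and fingerprint(source[k]) != fingerprint(target[k])
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}
```

Because only key lookups and hashes are needed, this kind of check works against key-value style storage, at the cost of reading every migrated record once.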
By Milind Khirwadkar, AVP cloud services, Symphony Services
Symphony Services is a leading global specialist providing software product engineering outsourcing services. The company’s focus on Engineering Outcome Certainty™ drives R&D results that shorten time-to-market for new products and delivers greater innovation to compete in a global marketplace. Independent software vendors (ISVs), software enabled businesses and companies whose products contain embedded software partner with Symphony Services to achieve their business goals.