Microsoft Azure : How-to setup a site-to-site VPN using OpenSwan (on a Telenet SOHO subscription)

Objective of the day?

We’ll be setting up an IPSec VPN tunnel between Microsoft Azure and a development/management environment using the commodity internet connection of a Belgian ISP.

Azure-Site_to_Site_VPN

What will our test environment look like?

  • Private Network : 192.168.0.0/24
  • System running Openswan : 192.168.0.226
  • Private Internet Connection : 81.82.83.84
  • Azure VPN Gateway : 104.40.149.247
  • Test System on Azure : 10.0.0.4
  • Azure Network : 10.0.0.0/24
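
A site-to-site tunnel routes traffic between two private address spaces, so it helps to sanity-check the address plan before touching any gateway. The snippet below is only a sketch using Python's standard `ipaddress` module; the function name `check_address_plan` is made up for this illustration, and the addresses are the ones from the list above.

```python
import ipaddress

# Addresses taken from the test environment listed above.
on_prem_net = ipaddress.ip_network("192.168.0.0/24")
azure_net = ipaddress.ip_network("10.0.0.0/24")
openswan_host = ipaddress.ip_address("192.168.0.226")
azure_test_host = ipaddress.ip_address("10.0.0.4")


def check_address_plan(local_net, remote_net, local_host, remote_host):
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    if local_net.overlaps(remote_net):
        problems.append("local and remote networks overlap")
    if local_host not in local_net:
        problems.append(f"{local_host} is not inside {local_net}")
    if remote_host not in remote_net:
        problems.append(f"{remote_host} is not inside {remote_net}")
    return problems


problems = check_address_plan(on_prem_net, azure_net, openswan_host, azure_test_host)
print(problems or "address plan looks consistent")
```

If both sides used the same range, the check would report an overlap, which is the kind of issue you want to catch before configuring the tunnel.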

The steps we’ll be going through?

 

  • Configure Virtual Network on Azure
  • Configure VPN Gateway
  • Configure Openswan
  • Configure NAT Rules on the ISP (Telenet) Router
  • Activate IPSec VPN Tunnel
  • Test Connectivity


Database variants explained : SQL or NoSQL? Is that really the question?

A first glance beyond the religion

Looking at the database landscape, one has to admit that there has been a lot of commotion about “SQL vs NoSQL” in recent years. But what is it really about?

SQL, which stands for “Structured Query Language”, has been around since the seventies and is commonly used in relational databases. It consists of a data definition language to define the structure and a data manipulation language to alter the data within the structure. Therefore an RDBMS will have a defined structure, and it has been a common choice for the storage of information in new databases used for financial records, manufacturing and logistical information, personnel data, and other applications since the 1980s.
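
To make the split between the two sub-languages concrete, here is a small sketch using Python's built-in `sqlite3` module. The table name `personnel` and its columns are hypothetical, picked only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition language (DDL): defines the structure.
conn.execute(
    "CREATE TABLE personnel (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# Data manipulation language (DML): alters the data within that structure.
conn.execute("INSERT INTO personnel (name, city) VALUES (?, ?)", ("Bob Smith", "Antwerp"))
conn.execute("UPDATE personnel SET city = ? WHERE name = ?", ("Brussels", "Bob Smith"))

rows = conn.execute("SELECT name, city FROM personnel").fetchall()
print(rows)
conn.close()
```

The structure has to exist before any data can go in, which is exactly the “defined structure” a relational database implies.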


NoSQL, which stands for “Not only SQL”, departs from the standard relational model and saw its first introduction in the nineties. The primary focus of these databases was performance, or a given niche, with less focus on consistency/transactions. These databases provide a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling, and finer control over availability. The data structures used by NoSQL databases (e.g. key-value, graph, or document) differ from those used in relational databases, making some operations faster in NoSQL and others faster in relational databases. The particular suitability of a given NoSQL database depends on the problem it must solve.

So it depends on your need…

Do you want NoSQL, NoSQL, NoSQL or NoSQL?

NoSQL comes in various flavors. The most common types of NoSQL databases (as portrayed by Wikipedia) ;

There have been various approaches to classify NoSQL databases, each with different categories and subcategories. Because of the variety of approaches and overlaps it is difficult to get and maintain an overview of non-relational databases. Nevertheless, a basic classification is based on data model. A few examples in each category are:

  • Column: Accumulo, Cassandra, Druid, HBase, Vertica
  • Document: Clusterpoint, Apache CouchDB, Couchbase, MarkLogic, MongoDB, OrientDB
  • Key-value: Dynamo, FoundationDB, MemcacheDB, Redis, Riak, FairCom c-treeACE, Aerospike, OrientDB
  • Graph: Allegro, Neo4J, InfiniteGraph, OrientDB, Virtuoso, Stardog
  • Multi-model: OrientDB, FoundationDB, ArangoDB, Alchemy Database, CortexDB

Column

A column of a distributed data store is a NoSQL object of the lowest level in a keyspace. It is a tuple (a key-value pair) consisting of three elements:

  • Unique name: Used to reference the column
  • Value: The content of the column. It can have different types, like AsciiType, LongType, TimeUUIDType, UTF8Type among others.
  • Timestamp: The system timestamp used to determine the valid content.

Example

{
    street: {name: "street", value: "1234 x street", timestamp: 123456789},
    city: {name: "city", value: "san francisco", timestamp: 123456789},
    zip: {name: "zip", value: "94107", timestamp: 123456789},
}
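
The role of the timestamp can be illustrated with a few lines of Python. This is a sketch under an assumption: it resolves conflicting versions with a simple “last write wins” rule, which is one common way to use such a timestamp, not necessarily what every column store does. The names `Column` and `resolve` are made up for the example.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int


def resolve(versions):
    """Pick the version with the highest timestamp (last write wins)."""
    return max(versions, key=lambda column: column.timestamp)


# Two hypothetical writes to the same column; the second one is newer.
older = Column("city", "san francisco", 123456789)
newer = Column("city", "oakland", 123456999)

current = resolve([older, newer])
print(current.value)
```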

Document

A document-oriented database is designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. The central concept of a document-oriented database is that Documents, in largely the usual English sense, contain vast amounts of data which can usefully be made available. Document-oriented database implementations differ widely in detail and functionality. Most accept documents in a variety of forms, and encapsulate them in a standardized internal format, while extracting at least some specific data items that are then associated with the document.

Example

<Article>
   <Author>
       <FirstName>Bob</FirstName>
       <Surname>Smith</Surname>
   </Author>
   <Abstract>This paper concerns....</Abstract>
   <Section n="1"><Title>Introduction</Title>
       <Para>...</Para>
   </Section>
 </Article>
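
As a rough idea of what “extracting at least some specific data items” could look like, the sketch below parses the article document from the example, written out as a well-formed string, with Python's standard `xml.etree.ElementTree`. The choice of extracted fields (`author`, `sections`) is an assumption for illustration only.

```python
import xml.etree.ElementTree as ET

# The article document from the example, as a well-formed XML string.
document = """
<Article>
   <Author>
       <FirstName>Bob</FirstName>
       <Surname>Smith</Surname>
   </Author>
   <Abstract>This paper concerns....</Abstract>
   <Section n="1"><Title>Introduction</Title>
       <Para>...</Para>
   </Section>
</Article>
"""

root = ET.fromstring(document)

# Extract a few data items that a store might associate with the document.
extracted = {
    "author": f"{root.findtext('Author/FirstName')} {root.findtext('Author/Surname')}",
    "sections": [section.findtext("Title") for section in root.findall("Section")],
}
print(extracted)
```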

Key-Value

A key-value store (an associative array, map, symbol table, or dictionary) is an abstract data type composed of a collection of key/value pairs, such that each possible key appears just once in the collection.

Example

{
    "Pride and Prejudice": "Alice",
    "The Brothers Karamazov": "Pat",
    "Wuthering Heights": "Alice"
}
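
In Python the same idea is simply a `dict`. The sketch below reuses the data from the example; the variable name `checked_out` and the extra book title are assumptions made for the illustration.

```python
# A Python dict is an associative array: each key appears at most once.
checked_out = {
    "Pride and Prejudice": "Alice",
    "The Brothers Karamazov": "Pat",
    "Wuthering Heights": "Alice",
}

# Writing to an existing key replaces its value instead of adding a second entry.
checked_out["Pride and Prejudice"] = "Pat"

# Lookup is by key; a missing key can fall back to a default.
print(checked_out["Pride and Prejudice"])
print(checked_out.get("Great Expectations", "not checked out"))
print(len(checked_out))
```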

Graph

A graph database is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A graph database is any storage system that provides index-free adjacency. This means that every element contains a direct pointer to its adjacent elements and no index lookups are necessary. General graph databases that can store any graph are distinct from specialized graph databases such as triplestores and network databases.

Example

GraphDatabase_PropertyGraph
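
Index-free adjacency can be sketched in a few lines of Python: every node keeps direct references to its neighbours, so a traversal just follows those references. This is a toy in-memory model, not how any particular graph database is implemented; the class, node names and edge labels are all hypothetical.

```python
class Node:
    """A graph node that holds direct references to its neighbours."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = properties
        self.edges = []  # list of (label, target_node) pairs

    def connect(self, label, target):
        self.edges.append((label, target))

    def neighbours(self, label):
        # Follows the stored references directly; no global index is consulted.
        return [target for edge_label, target in self.edges if edge_label == label]


alice = Node("Alice", age=30)
bob = Node("Bob", age=32)
group = Node("Chess club")

alice.connect("knows", bob)
alice.connect("member_of", group)
bob.connect("member_of", group)

print([n.name for n in alice.neighbours("knows")])
```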

MultiModel

Most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. In contrast, a multi-model database is designed to support multiple data models against a single, integrated backend. Document, graph, relational, and key-value models are examples of data models that may be supported by a multi-model database.

And what flavor do I want?

Each type and implementation has its own advantages… The following chart from Shankar Sahai provides a good overview ;

nosql-comparison-table

Any other considerations I should take into account?

Be wary that most implementations were not designed around consistency and integrity, but rather around performance. Transactions and referential integrity are not supported by most implementations. High availability designs (including on a geographic level) are possible with some implementations, though this often implies a performance impact (as one would expect).

Also check out the research made by Altoros ;

5. Conclusion
As you can see, there is no perfect NoSQL database. Every database has its advantages and disadvantages that become more or less important depending on your preferences and the type of tasks.
For example, a database can demonstrate excellent performance, but once the amount of records exceeds a certain limit, the speed falls dramatically. It means that this particular solution can be good for moderate data loads and extremely fast computations, but it would not be suitable for jobs that require a lot of reads and writes. In addition, database performance also depends on the capacity of your hardware.

They did a very decent job in performance testing various implementations!

2015-01-21 09_08_23-A_Vendor_independent_Comparison_of_NoSQL_Databases_Cassandra_HBase_MongoDB_Riak.

Web Development : A step up with Automated Deployment

Developing a website… ; Open up “notepad++”, browse to your web server via FTP and edit the files. Then refresh to see the changes…

Sounds familiar? Probably… It’s a very straightforward and easy process. The downside however is that you have no tracking of your changes (Version Control) and that the process is pretty manual. So this becomes a problem when you aren’t the only one on the job or if something goes wrong.

So let’s step it up and introduce “version control”… Now we have an overview of all the revisions we made to our code and we are able to revert to any of them. Yet suddenly, we need to do a lot more to get our code onto the web server. This brings us to the point where we want a kind of helper that does the “deployment” for us.

The basic process
automated-web-development-kvaes.be

  • Local Development : The development will happen here. Have fun… When you (think you) are happy with what you have produced, you update the files via your version control system.
  • Source Repository : The source repository will contain all the versions of your code. Here you can configure it to send a notification to your deployment system whenever a new version has been introduced.
  • Deployment System : The deployment system will query the source repository and retrieve the latest code. This code will be packaged, transmitted and deployed onto the target system(s).
  • Target Systems : The systems that will actually host your code and deliver the (web) service!
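
The four roles above can be mimicked in a small, purely in-memory Python sketch. None of this talks to a real repository or server; the class names, the `push` method and the `web01` target are assumptions made up to show the flow of “push, notify, fetch, package, deploy”.

```python
class TargetSystem:
    def __init__(self, name):
        self.name = name
        self.deployed_version = None


class DeploymentSystem:
    def __init__(self, repository, targets):
        self.repository = repository
        self.targets = targets

    def on_new_version(self):
        """Called by the repository hook: fetch the latest code and deploy it."""
        version, files = self.repository.latest()
        package = {"version": version, "files": dict(files)}  # "packaging" step
        for target in self.targets:
            target.deployed_version = package["version"]  # "deploy" step
        return package


class SourceRepository:
    def __init__(self):
        self.versions = []
        self.hooks = []

    def latest(self):
        return self.versions[-1]

    def push(self, version, files):
        self.versions.append((version, files))
        for hook in self.hooks:  # notify every registered deployment system
            hook()


repo = SourceRepository()
web01 = TargetSystem("web01")
deployer = DeploymentSystem(repo, [web01])
repo.hooks.append(deployer.on_new_version)

repo.push("v1", {"index.html": "<h1>Hello</h1>"})
repo.push("v2", {"index.html": "<h1>Hello again</h1>"})
print(web01.deployed_version)
```

The point of the design is that the developer only ever pushes to the repository; everything after that is triggered by the notification.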

Real Life Example?
Ingredients

Recipe

  • Create a private repository at BitBucket
  • Pull/push the repository between BitBucket & your local SourceTree
  • In BitBucket, go to “Settings”, “Deployment Keys” and generate a key for your automation. Copy it to your clipboard…
    2015-01-12 15_33_53-kvaes _ kvaes.be - 2015 _ Admin _ Deployment keys — Bitbucket
  • In DeployHQ, go to “Settings”, “General Settings” and copy the key into the “Public Key Authentication” textbox.
    2015-01-12 15_31_19-Website 2015 - LogiTouch - Deploy
  • In DeployHQ, go to “Settings”, “Servers & Group” and create a new server.
    2015-01-12 15_36_53-Website 2015 - LogiTouch - Deploy
  • In the same screen, enable “Auto Deploy” and copy the URL hook.
    2015-01-12 15_38_19-Website 2015 - LogiTouch - Deploy
  • Now go to “Settings” in BitBucket, and then “Hooks”. Add a “POST” hook containing the URL hook you just copied.
    2015-01-12 15_39_11-kvaes _ kvaes.be - 2015 _ Admin _ Hooks — Bitbucket
  • Now every time you commit and push from your workstation, the code will be deployed to your server!

In fact, this is the mechanism I utilize for my own (hobby) development projects. An example of this is my own homepage, which is deployed via the system described above.

The DTAP-Street : a phased approach to a development / deployment cycle

The acronym DTAP finds its origin in the words Development, Testing, Acceptance and Production. The DTAP-street is a commonly accepted method to have a phased approach to software development / deployment.

A typical flow works as follows :

  • Development – This environment is where the software is developed. It is the first environment that is used. Changes are very frequent here, as this is the first area where creativity is forged into a product.
  • Test – A developer is (hopefully) not alone. In the test environment, the complete code base is merged and forged into one single product. The first attempts at standardization and alignment towards the future production environment are made here.
  • Acceptance – Once the development team feels that the product is ready, it will be deployed to acceptance. This is a look-alike of production and is used by operations as a staging environment for production releases.
  • Production – The real deal… Here the product surely needs to be ready for prime-time.

dtap-kvaes.be-271014

Sometimes the following are also added ;

  • Education / Training – Sometimes a dedicated environment is needed where people can test drive the software in a safe sandbox. For efficiency reasons, this environment is often time-shared with acceptance.
  • Backup / Disaster Recovery – Disasters can happen… Therefore some disaster recovery plans may rely on a dedicated backup / disaster recovery location.
  • Integration – An environment that is sometimes located between “Test” & “Acceptance” as an intermediate step to test certain partner integrations. Just as with the “education” environment, this environment is often time-shared with acceptance.

What are the most commonly used formations?

  • Live – Production – Many companies rely solely on a production environment. The risk reduction is often neglected in favor of the cost benefit of having one environment.
  • Staging – Production/Test – If no real customizations are done to the implemented software, then two environments may suffice.
  • DTAP – Development/Test/Acceptance/Production – Once customizations hit… then a full DTAP-street is needed to reduce the amount of risk involved with software development.
  • DTAPB – Development/Test/Acceptance/Production/Backup – This is an enhanced DTAP-street that is capable of doing a disaster recovery. (Sidenote ; The Test/Development environment is often shared with the backup location. This provides the advantage that the resources of the Test/Development can be sacrificed during a disaster.)

What Code / Data flows occur between the environments?

  • Software Versions – Software releases go from Development to Test to Acceptance to Production… The timing varies with the chosen release management cycle, though typical times are as follows ; Development (Continuous Builds), Test (Daily Build), Acceptance (Once per quarter, three weeks before production), Production (Once per quarter)
  • Data – Data flows in the opposite direction to software versions. Data is taken from production and copied to Acceptance / Test / Development. Depending on the environment (and relative security compliance), the data may be anonymized or even reduced to have a representative production workload of a limited size.
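
Both flows can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names, the sample record and the list of sensitive fields are hypothetical, and hashing a value is strictly speaking pseudonymization, so a real anonymization step would need more care.

```python
import hashlib

# Software moves forward through the street; data moves backwards.
STREET = ["Development", "Test", "Acceptance", "Production"]


def next_environment(current):
    """Return the environment a release is promoted to, or None after Production."""
    index = STREET.index(current)
    return STREET[index + 1] if index + 1 < len(STREET) else None


def anonymize(record, sensitive_fields=("name", "email")):
    """Replace sensitive values with a short, stable hash before copying data down."""
    cleaned = dict(record)
    for field in sensitive_fields:
        if field in cleaned:
            digest = hashlib.sha256(str(cleaned[field]).encode("utf-8")).hexdigest()
            cleaned[field] = digest[:12]
    return cleaned


production_row = {"id": 7, "name": "Bob Smith", "email": "bob@example.com", "total": 42}
test_row = anonymize(production_row)

print(next_environment("Test"))
print(test_row["id"], test_row["total"])
```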

Lingo Explained : Technical Debt

What is technical debt?

A design or construction approach that’s expedient in the short term but that creates a technical context in which the same work will cost more to do later than it would cost to do now (including increased cost over time)

technical-debt-explained

Example

“Guys, we don’t have time to dot every i and cross every t on this release. Just get the code done. It doesn’t have to be perfect. We’ll fix it after we release.”

A quote from the past

“As an evolving program is continually changed, its complexity, reflecting deteriorating structure, increases unless work is done to maintain or reduce it.” — Meir Manny Lehman, 1980

Need a bit more info? Check out the presentation on technical debt at the International Conference on Software Engineering anno 2013 or Wikipedia.

Devops : What kind of animal is it, and what dogfood does(n’t) it eat.

Wikipedia
While I believe in this role from the bottom of my heart, I often see it misinterpreted or even abused… So today we’ll talk about the “devops”-animal, and what it should be doing. Let’s first take a glance at Wikipedia (as it’s always a nice reference) ;

DevOps (a portmanteau of development and operations) is a software development method that stresses communication, collaboration and integration between software developers and information technology (IT) professionals. DevOps is a response to the interdependence of software development and IT operations. It aims to help an organization rapidly produce software products and services.

So the essence of the role (and yes, it is a role, not an animal, sorry to disappoint you) lies as a bridging function between “Development” & “Operations”. This statement is backed up in the Wikipedia article too ;

DevOps is frequently described as a more collaborative and productive relationship between development teams and operations teams. This improved relationship and collaboration increases efficiency and reduces the production risk associated with frequent changes.

The role of a DevOps professional has similarities to that of a Chief Engineer within the Toyota Production System. Such persons have responsibilities for the project’s success, but no formal authority over different teams involved. This requires technical knowledge in order to convince managers of the needs. Company executives can make convincing the managers more effective by formally endorsing the role of the Chief Engineer.

Many organizations divide Development and System Administration into different departments. While Development departments are usually driven by user needs for frequent delivery of new features, Operations departments focus more on availability, stability of IT services and IT cost efficiency. These two contradicting goals create a “gap” between Development and Operations, which slows down IT’s delivery of business value.

What does it bring?
The article visualizes the role as following ;
400px-Devops.svg
While this might seem a bit different from the “bridging role” just mentioned, imagine “Quality Assurance” as the role you often see with your Architect(s). When we go to the stereotypes of each department, we see that development is typically the team that wants to “innovate”, whereas operations is the team that wants the least amount of change in order to provide the needed stability. The devops will take both worlds into account. (S)he will introduce things like “High Availability”, “Scalability”, “Performance”, “Security”, “Data Integrity”, “Monitoring”, and so on from the Operations world. On the other side, (s)he’ll also introduce versioning, automation, release management, and so on from the development world.

So what isn’t it?
Think about the following pitfalls ;

  • Devops is not a team…
  • Devops does not live within one technology team. (S)he stretches across boundaries!
  • It’s also not the person who gets “volunteered” and ends up stuck with all your releases.

I think with those statements, the interpretation many companies have of “Devops” went down the drain…

Still confused?
Still not sure what it is? Check Patrick Debois’s slideshare presentation on Devops, as that one is spot on!

Software Development Methods

Introduction
Lately I’ve been noticing that it’s not that common to understand the different software development methodologies that are being used in the field. So this post is to provide you with a quick overview.

“A software development methodology or system development methodology in software engineering is a framework that is used to structure, plan, and control the process of developing an information system.” Source : Wikipedia

Waterfall (Traditional)
The waterfall development model originates in the manufacturing and construction industries; highly structured physical environments in which after-the-fact changes are prohibitively costly, if not impossible. Since no formal software development methodologies existed at the time, this hardware-oriented model was simply adapted for software development. The waterfall model is a sequential design process, often used in software development processes, in which progress is seen as flowing steadily downwards (like a waterfall) through all the phases.
Waterfall_model_(1).svg
More info : http://en.wikipedia.org/wiki/Waterfall_model

RUP (Iterative)
The Rational Unified Process (RUP) is an iterative software development process framework created by the Rational Software Corporation, a division of IBM since 2003. RUP is not a single concrete prescriptive process, but rather an adaptable process framework, intended to be tailored by the development organizations and software project teams that will select the elements of the process that are appropriate for their needs.
Development-iterative
More info : http://en.wikipedia.org/wiki/Rational_Unified_Process

RAD (Spiral)
Rapid application development (R.A.D) is a software development methodology that uses minimal planning in favor of rapid prototyping. The “planning” of software developed using RAD is interleaved with writing the software itself. The lack of extensive pre-planning generally allows software to be written much faster, and makes it easier to change requirements.
RADModel
More info : http://en.wikipedia.org/wiki/Rapid_application_development

XP (Agile)
Extreme Programming (XP) is a software development methodology which is intended to improve software quality and responsiveness to changing customer requirements. As a type of agile software development, it advocates frequent “releases” in short development cycles (timeboxing), which is intended to improve productivity and introduce checkpoints where new customer requirements can be adopted.
XP-feedback
More info : http://en.wikipedia.org/wiki/Extreme_Programming

SCRUM
Scrum is an iterative and incremental agile software development framework for managing software projects and product or application development. Scrum focuses on project management in situations where it is difficult to plan ahead. It relies on mechanisms of empirical process control, in which feedback loops constitute the core management technique, as opposed to traditional command-and-control management. Its approach to planning and managing projects brings decision-making authority down to the level of operational properties and certainties.
Scrum_process.svg
More info : http://en.wikipedia.org/wiki/Scrum_%28development%29
(Another method named “Kanban” is often associated with SCRUM, yet there are subtle differences!)

Closing Thoughts
Be aware that there is NO silver bullet… Each method has its own advantages and disadvantages. Be sure to study each one before choosing the right one for YOUR team. Just because SCRUM worked for a given team on project Z doesn’t mean it will work for another team coding project Y. (Be aware of cargo cult!)