Cheap and Secure Cloud Backups

I’ve wanted to find a good provider of cheap and secure cloud backups for a while. I’ve compared some cloud drive providers, but didn’t quite like those. They usually have very limited free plans, somewhat pricey paid plans (e.g. 50GB for about 24$ a year for OneDrive), or like in the case of Google no information available at all. By the way, “Google one is coming soon” isn’t an announcement that I want to look at for more than a few days when looking for pricing info.

Then, I’ve looked at pricing of cloud storage providers, such as AWS, Azure and Google Cloud. Those offer storage around 1 cent ($0.01) per GB per month. That’s a quarter of the OneDrive cost! It’s even less if you consider their archive offerings (AWS Glacier, Archive in Azure, Coldline Storage for Google). The cheapest offering here is from Microsoft at 0.2 cents ($0.002) per GB per month, but with some usage caveats. Since the point of backups is to keep them for a long time, this quickly adds up though.

Now I’ve written a line or two of code before, so I figured I could as well write my own tool for this. So here is bart, the backup and restore tool. Note that at this point I do not offer bart as a ready-to-use executable, but only as MIT-licensed source code. In addition, bart currently works only with Azure Blob Storage – or with storage mounted into the machine’s file system. However, adding other cloud providers/archive destinations should be relatively easy, given the interfaces used in the tool.

Security

In terms of security, bart encrypts every file before storing it in the archive destination. A user-provided password is used together with a randomly generated salt to derive a key for encryption with AES. On first use of any archive destination, bart generates a random salt, and each archive has its own password and salt. To avoid anybody with access to the archive destination from even snooping the names of your files, the names are hashed (SHA1) and the hashes used to store the encrypted files. This has the disadvantage that renaming/moving a file results in another file in the destination archive, though.

Usage

Once you compiled bart, you can use it as follows.

./bart [-name string] [-path string] [-m noop|restore|delete] -acct string -key string
  -name string
        The name of the backup archive. (default "backup")
  -path string
        The path to the directory to backup and/or restore. (default ".")
  -m string
        A behavior for files missing locally: 'noop' to do nothing, 'restore' to restore them from the backup, 'delete' to delete them in the backup archive. (default "noop")
  -acct string
        The Azure Storage Account name.
  -key string
        The Azure Storage Account Key.

Sources

The sources are on GitHub @ https://github.com/rokeller/bart.

Conclusion

I’ve used bart for backup of some photos/videos for a while now. For the about 42GB I have uploaded so far my monthly bill from Microsoft is about 42 cents ($0.42). Those months where I upload new files the cost is a little higher (a few cents usually) because of the extra transactions. My backed up files are encrypted. If this isn’t cheap and secure cloud backups, what is?

Lucene.Net.ObjectMapping for .Net Standard 2.0

It’s been a long time since I’ve done some work on my Lucene.Net.ObjectMapping library. Recently I accepted a pull request that added support for the 4.8 beta releases of Lucene.Net itself, but when I involuntarily needed to updated one of my services to bring it up to speed with running in a Docker container, I decided that it was about time to update Lucene.Net.ObjectMapping for .Net Standard 2.0. The last time I used the library in a Docker container, ASP.NET vNext RC1 was just about to become final. so that’s a long time ago. Accordingly, there was quite a bit of work to understand the changes needed: both in .Net (and ASP.NET) between the 1.0 RC1 and the .Net Standard 2.0 releases, and also between the Lucene.Net 3.x and 4.8 releases. Luckily, the latter was largely taken care of by the pull request for the library itself. The former however proved a bit challenging. After all, the toolset has changed significantly.

Updated Sources

To cut a long story short, the updated sources are now available on GitHub. I decided to track it in a separate branch for better isolation. This new branch is aptly called netstandard. I’ll try to stay up-to-date with the more recent releases of Lucene.Net, and also with .Net Standard 2.0. That is, provided that I find the time for it. You may notice that the project files have become quite a bit simpler. That’s certainly one change in .Net Standard and Core that I welcome. The other is the better integration of Nuget for package referencing and package creation/pushing.

Updated Unit Tests

As a side effect, I also figured that it was going to be easier to update NUnit to the latest version, since its toolset is also well integrated with the new dotnet toolset. Since I’m doing all changes through VSCode and with building/testing/packaging in Docker containers based on the microsoft/aspnetcore-build:2 images, I wanted to keep it simple. The good thing here is that the dotnet toolset seems to offer really everything I need for this, and is suprisingly easy to handle, especially when compared to the RC1 version.

Updated Nuget Package

As I’ve mentioned in the beginning, I primarily made this effort because I needed a newer version of Lucene.Net with compatibility for .Net Standard 2.0. As a result, I published a new RC build as a Nuget package too. It is built on the latest Lucene.Net 4.8 beta release and currently supports only .Net Standard 2.0. If there’s a great demand for it, I’ll see if I can add support for other targets – or accept pull requests accordingly.

Conclusions

Nothing much besides the obvious: .Net Standard seems to be in a good shape wrt libraries and toolset, as well as support on Linux. There are a few gotchas but overall nothing much of a problem. Lucene.Net is still somewhat badly documented itself, and the tracking of braking changes between major/minor versions (and in fact also revisions/beta releases of the same major/minor) could be greatly improved. An online documentation would be very useful – maybe it exists, and I just haven’t found it? In any case, skimming through the Lucene.Net sources on GitHub works too, though being much slower.

You can find more information about object mapping for Lucene.Net on the Lucene.Net.ObjectMapping page.

Offline JSON Pretty Printing

Today when you’re dealing with Web APIs, you often find yourself in the situation of handling JSON, either in the input for these APIs or in the output, or both. Some browsers have the means to pretty print the JSON from their dev tools. But you don’t always have that opportunity. That’s why there are tools to pretty print JSON. I’ve found quite a few of them on the web, but all the ones I’ve found have one terrible flaw: they actually send the JSON you’re trying to pretty print to the server (*shudder*). I don’t want my JSON data (sensitive or not) to be sent to some random servers!

All your JSON are belong to us!

Now as I wrote, I don’t particularly like the fact that my JSON data is sent over the wire for pretty printing. It may not be super secret or anything, but in these days, you cannot be careful enough. Besides, it’s completely unnecessary to do it. All you need is already in your browser! So I quickly built my own JSON pretty printer (and syntax highlighter). You can find it right here.

Offline JSON Pretty Printing to the Rescue

Actually, the design is very simple. All my JSON pretty printer is doing, is to take your JSON input and try to parse it as JSON in the browser.

JSON.parse(yourJsonInput)

If that fails, I’m showing the parsing error and it’s done. If it succeeds, I get back a JavaScript object/array/value, which then I’m inspecting. For objects, I’m using basic tree navigation to go through all the properties and nested objects/arrays/values for pretty printing. That’s it, really simple. No need to transmit the data anywhere — it stays right in your browser!

So like it, hate it, use it or don’t: cymbeline.ch JSON Pretty Printer

LINQ with Lucene.Net.ObjectMapping

Last time I mentioned that I started to work on supporting LINQ with Lucene.Net.ObjectMapping. That includes LINQ queries like the following:

using (Searcher searcher = new IndexSearcher(directory))
{
    IQueryable<BlogPost> posts =
        from post in searcher.AsQueryable<BlogPost>()
        where obj.Tag == "lucene"
        orderby obj.Timestamp descending
        select post;
}

Now granted, the above example is a very basic one. So here’s a short list of other methods on IQueryable<T> that are already supported at this point: Any *, Count *, First *, FirstOrDefault *, OrderBy, OrderByDescending, Single *, SingleOrDefault *, Skip, Take, ThenBy, ThenByDescending, and finally Where.

* Method is supported both with and without a filter predicate.

With this, it becomes easy to build paging based on objects you get back as a result of a query on Lucene.Net. I’m still working on improving the supported filter expressions (most of all for Where, but all the other filterable methods naturally profit too). For instance, with the default JSON-based object mapping it is already possible to search for entries in a dictionary that maps a string to another property or object. Say you have a set of classes, defined as follows.

public class MyClass
{
    public int Id { get; set; }
    public Dictionary<string, MyOtherClass> Map { get; set; }
}

public class MyOtherClass
{
    public string Text { get; set; }
    public int Sequence { get; set; }
    public DateTime Timestamp { get; set; }
}

Now you can actually search for instances of MyClass that satisfy certain conditions in the Map dictionary, like this:

var query = from c in searcher.AsQueryable<MyClass>()
            where c.Map["MyKey"].Sequence == 123
            select c;

Since the items in the dictionary are mapped to analyzed fields in the Lucene.Net document, we can search on them!

Delete and Update By Query

Now since I have this query expression binder to create Lucene.Net queries based on LINQ filter expressions, I’ve added an extension method to update and one to delete documents that match a query. So it is now possible to do this:

indexWriter.Delete<MyClass>(x => x.Id == 1234);
indexWriter.Update(myObject, x => x.Id == myObject.Id);

Call to Action

Now with all this said, I’m looking for volunteers to help me get more coverage on the LINQ queries, because that’s definitely where the weak spot is right now. If you’re interested, leave a comment here or on GitHub.