setting zpool features

Gibheer
2014-12-10
13-40

Before Sun was bought by Oracle, OpenSolaris kept getting newer versions and upgrading a pool was just a

$ zpool upgrade rpool

away. But since then, the open source version of ZFS gained feature flags. Running zpool upgrade without arguments now lists the features that are not yet enabled on a pool:

POOL  FEATURE
---------------
tank1
      multi_vdev_crash_dump
      enabled_txg
      hole_birth
      extensible_dataset
      embedded_data
      bookmarks
      filesystem_limits

If you want to enable only one of these features, you may have already hit the problem that zpool upgrade always enables all features at once, either on one pool or on all pools.

The way to go is to use zpool set. Feature flags are options on the pool and can also be listed with zpool get.

$ zpool get all tank1 | grep feature
tank1  feature@async_destroy          enabled                        local
tank1  feature@empty_bpobj            active                         local
tank1  feature@lz4_compress           active                         local
tank1  feature@multi_vdev_crash_dump  disabled                       local
...

Enabling a feature, for example multi_vdev_crash_dump, would then be

$ zpool set feature@multi_vdev_crash_dump=enabled tank1

It will then disappear from the zpool upgrade output and show up as enabled or active in zpool get.
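To find the candidates quickly, the zpool get output can be filtered for disabled features. The following is a minimal sketch in Go that only works on text already captured from zpool get. The helper name disabledFeatures is made up for this example, and the sketch does not call zpool itself.

```go
package main

import (
	"fmt"
	"strings"
)

// disabledFeatures is a hypothetical helper. It returns the names of all
// feature flags marked "disabled" in the given `zpool get` output.
func disabledFeatures(output string) []string {
	var features []string
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		// expected columns: pool, property, value, source
		if len(fields) < 3 || !strings.HasPrefix(fields[1], "feature@") {
			continue
		}
		if fields[2] == "disabled" {
			features = append(features, strings.TrimPrefix(fields[1], "feature@"))
		}
	}
	return features
}

func main() {
	// sample taken from the zpool get output shown above
	sample := `tank1  feature@async_destroy          enabled                        local
tank1  feature@empty_bpobj            active                         local
tank1  feature@lz4_compress           active                         local
tank1  feature@multi_vdev_crash_dump  disabled                       local`

	// print the command for each disabled feature instead of running it
	for _, f := range disabledFeatures(sample) {
		fmt.Printf("zpool set feature@%s=enabled tank1\n", f)
	}
}
```

The sketch only prints the commands, so the decision which feature to enable stays with you.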

using unbound and dnsmasq

Gibheer
2014-12-09
22-13

After some time of using an Almond as our router and always having trouble with disconnects, I bought a small apu1d4, an AMD low power board, as our new router. It is now running FreeBSD and is very stable. Not a single connection has been dropped so far.

As we have some services in our network, like a fileserver and a printer, we always wanted to use names instead of IPs, but no router we had so far could provide that. So this was the first problem I solved.

FreeBSD comes with unbound preinstalled. Unbound is a caching DNS resolver, which helps answer DNS queries faster, when they were already queried before. I wanted to use unbound as the primary source for DNS queries, as the caching functionality is pretty nice. Further I wanted an easy DHCP server, which would also function as a DNS server. For that purpose dnsmasq fits best. There are also ways to use dhcpd, bind and some glue to get the same result, but I wanted as few services as possible.

So my setup constellation looks like this:

client -> unbound -> dnsmasq
             +-----> ISP dns server

For my internal tld, I will use zero. The dns server is called cerberus.zero and has the IP 192.168.42.2. The network for this setup is 192.168.42.0/24.

configuring unbound

For this to work, first we configure unbound to make name resolution work at all. Most files already have pretty good defaults, so we will overwrite these with a file in /etc/unbound/conf.d/, in my case /etc/unbound/conf.d/zero.conf.

server:
  interface: 127.0.0.1
  interface: 192.168.42.2
  do-not-query-localhost: no
  access-control: 192.168.42.0/24 allow
  local-data: "cerberus. 86400 IN A 192.168.42.2"
  local-data: "cerberus.zero. 86400 IN A 192.168.42.2"
  local-data: "2.42.168.192.in-addr.arpa 86400 IN PTR cerberus.zero."
  local-zone: "42.168.192.in-addr.arpa" nodefault
  domain-insecure: "zero"

forward-zone:
  name: "zero"
  forward-addr: 127.0.0.1@5353

forward-zone:
  name: "42.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@5353

So what happens here is the following. First we tell unbound on which addresses it should listen for incoming queries. Next we state that querying DNS servers on localhost is totally okay. This is needed to later be able to resolve addresses through the local dnsmasq. If your dnsmasq is running on a different machine, you can leave this out. With access-control we allow the network 192.168.42.0/24 to query the DNS server.

The three local-data lines tell unbound that the names cerberus and cerberus.zero are one and the same machine, the DNS server, and add the matching reverse entry. Without these lines unbound would not resolve the name of the local server, even if its name were listed in /etc/hosts. With the local-zone line we enable reverse name resolution for the local network. The key domain-insecure tells unbound that this domain has no support for DNSSEC. DNSSEC is enabled by default on unbound.

The two forward-zone entries tell unbound where it should send queries for the zero tld and for the reverse entries of the network. The address in this case points to the dnsmasq instance. In my case, that is running on localhost and port 5353.
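The decision which names go to dnsmasq can be pictured as a suffix match on the zone name. Here is a small sketch in Go, assuming a strongly simplified model of what unbound does. The map forwardZones and the function forwarderFor are invented for this example and are not part of unbound.

```go
package main

import (
	"fmt"
	"strings"
)

// forwardZones mirrors the two forward-zone entries from the config above.
var forwardZones = map[string]string{
	"zero.":                    "127.0.0.1@5353",
	"42.168.192.in-addr.arpa.": "127.0.0.1@5353",
}

// forwarderFor returns the forwarder for a name and whether any zone matched.
// A name matches if it equals the zone or ends in "." followed by the zone.
func forwarderFor(name string) (string, bool) {
	name = strings.ToLower(name)
	if !strings.HasSuffix(name, ".") {
		name += "." // make the name fully qualified
	}
	for zone, addr := range forwardZones {
		if name == zone || strings.HasSuffix(name, "."+zone) {
			return addr, true
		}
	}
	return "", false
}

func main() {
	for _, n := range []string{"sodium.zero", "23.42.168.192.in-addr.arpa", "example.org"} {
		addr, ok := forwarderFor(n)
		fmt.Println(n, addr, ok)
	}
}
```

Names outside of both zones get no match in this sketch, which corresponds to unbound resolving them the normal way.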

Now we can add unbound to /etc/rc.conf and start unbound for the first time with the following command

$ sysrc local_unbound_enable=YES && service local_unbound start

Now you should be able to resolve the local hostname already

$ host cerberus.zero
cerberus.zero has address 192.168.42.2

configuring dnsmasq

The next step is to configure dnsmasq, so that it provides DHCP and name resolution for the network. When adjusting the config, please read the comments for each option in your config file carefully. You can find an example config in /usr/local/etc/dnsmasq.conf.example. Copy it to /usr/local/etc/dnsmasq.conf and open it in your editor:

port=5353
domain-needed
bogus-priv
no-resolv
no-hosts
local=/zero/
except-interface=re0
bind-interfaces
local-service
expand-hosts
domain=zero
dhcp-range=192.168.42.11,192.168.42.200,255.255.255.0,48h
dhcp-option=option:router,192.168.42.2
dhcp-option=option:dns-server,192.168.42.2
dhcp-host=00:90:f5:f0:fc:13,0c:8b:fd:6b:04:9a,sodium,192.168.42.23,96h

First we set the port to 5353, as defined in the unbound config. On this port dnsmasq will listen for incoming DNS requests. The next two options avoid forwarding DNS requests needlessly. The option no-resolv keeps dnsmasq from learning about any other DNS server from /etc/resolv.conf, and no-hosts does the same for the entries in /etc/hosts. The sole purpose of dnsmasq here is to provide DNS for the local domain, so it does not need to know about them.

The next option tells dnsmasq for which domain it is responsible. It will also avoid answering requests for any other domain.

except-interface tells dnsmasq on which interfaces it should not listen. You should list all external interfaces here, so that queries from the internet cannot discover hosts on your internal network. The option bind-interfaces makes dnsmasq bind only to the allowed interfaces instead of binding to all interfaces and filtering the traffic. This makes dnsmasq a bit more secure, as not listening at all is better than listening and discarding.

The two options expand-hosts and domain=zero append the given domain to plain host names that lack one. This way, it is easier to resolve hosts in the local domain.

The next three options configure the DHCP part of dnsmasq. First is the range. In this example, the range starts at 192.168.42.11 and ends at 192.168.42.200, and all IPs get a 48h lease time. So if a new host enters the network, it will be given an IP from this range. The next two lines set options sent with the DHCP offer to the client, so it learns the default route and DNS server. As both are running on the same machine in my case, they point to the same IP.

Now all machines which should have a static name and/or IP can be set through dhcp-host lines. You have to give the mac address, the name, the IP and the lease time. There are many examples in the example dnsmasq config, so the best is to read these.

When your configuration is done, you can enable the dnsmasq service and start it

$ sysrc dnsmasq_enable=YES && service dnsmasq start

When you get your first IP, run the following query on the client and it should return your IP

$ host $(hostname)
sodium.zero has address 192.168.42.23

With this, we have a running DNS server setup with DHCP.

common table expressions in postgres

Gibheer
2014-10-13
21-45

Four weeks ago I was asked to show some features of PostgreSQL. For that presentation I came up with an interesting statement, with which I could show a nice feature.

What I’m talking about is the usage of common table expressions (CTE for short) and explain.

Common table expressions create a temporary table just for this query. The result can be used anywhere in the rest of the query. It is pretty useful to group sub selects into smaller chunks, but also to create DML statements which return data.

A statement using CTEs can look like this:

with numbers as (
  select generate_series(1,10)
)
select * from numbers;

But it gets even nicer, when we can use this to move data between tables, for example to archive old data.

Let’s create a table and an archive table and try it out.

$ create table foo(
  id serial primary key,
  t text
);
$ create table foo_archive(
  like foo
);
$ insert into foo(t)
  select generate_series(1,500);

The like option can be used to copy the table structure to a new table.

The table foo is now filled with data. Next we will delete all rows where the ID modulo 25 is 0 and insert these rows into the archive table.

$ with deleted_rows as (
  delete from foo where id % 25 = 0 returning *
)
insert into foo_archive select * from deleted_rows;

Another nice feature of postgres is the possibility to get an explain from a delete or insert. So when we prepend explain to the above query, we get this explain:

                            QUERY PLAN
───────────────────────────────────────────────────────────────────
 Insert on foo_archive  (cost=28.45..28.57 rows=6 width=36)
   CTE deleted_rows
     ->  Delete on foo  (cost=0.00..28.45 rows=6 width=6)
           ->  Seq Scan on foo  (cost=0.00..28.45 rows=6 width=6)
                 Filter: ((id % 25) = 0)
   ->  CTE Scan on deleted_rows  (cost=0.00..0.12 rows=6 width=36)
(6 rows)

This explain shows that a sequential scan is done for the delete and that its result is grouped into the CTE deleted_rows, our temporary table. This is then scanned again and used to insert the data into foo_archive.

range types in postgres

Gibheer
2014-08-08
20-23

Nearly two years ago, Postgres got a very nice feature - range types. These are available for timestamps, numerics and integers. The problem is that until now, I didn’t have a good example of what one could do with them. But today someone gave me a quest to use them!

His problem was, that they had id ranges used by customers and they weren’t sure if they overlapped. The table looked something like this:

create table ranges(
  range_id serial primary key,
  lower_bound bigint not null,
  upper_bound bigint not null
);

With data like this

insert into ranges(lower_bound, upper_bound) values
  (120000, 120500), (123000, 123750), (123750, 124000);

They had something like 40,000 rows of that kind. So this was perfect for using range type queries.

To find out, if there was an overlap, I used the following query

select *
  from ranges r1
  join ranges r2
    on int8range(r1.lower_bound, r1.upper_bound, '[]') &&
       int8range(r2.lower_bound, r2.upper_bound, '[]')
 where r1.range_id != r2.range_id;

In this case, int8range takes two bigint values and converts them to a range. The string [] defines whether the lower and upper values are included in or excluded from the range. In this example, both are included. The output for this query looked like the following

 range_id │ lower_bound │ upper_bound │ range_id │ lower_bound │ upper_bound
──────────┼─────────────┼─────────────┼──────────┼─────────────┼─────────────
        2 │      123000 │      123750 │        3 │      123750 │      124000
        3 │      123750 │      124000 │        2 │      123000 │      123750
(2 rows)

Time: 0.317 ms
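What the && operator checks for two closed ranges can be written down in a few lines. The following Go sketch uses the three example rows from above and does the same naive self join as the query. The type idRange and the function overlaps are names invented for this example, this is not how Postgres implements it.

```go
package main

import "fmt"

// idRange mirrors one row of the ranges table. Both bounds are inclusive,
// like int8range(lower_bound, upper_bound, '[]').
type idRange struct {
	id           int
	lower, upper int64
}

// overlaps reports whether two closed ranges share at least one value.
func overlaps(a, b idRange) bool {
	return a.lower <= b.upper && b.lower <= a.upper
}

func main() {
	rows := []idRange{
		{1, 120000, 120500},
		{2, 123000, 123750},
		{3, 123750, 124000},
	}
	// naive self join: every row is compared with every other row
	for _, r1 := range rows {
		for _, r2 := range rows {
			if r1.id != r2.id && overlaps(r1, r2) {
				fmt.Println(r1.id, "overlaps", r2.id)
			}
		}
	}
}
```

Rows 2 and 3 only share the single value 123750, which counts as an overlap because both bounds are included. The nested loop also shows why the number of comparisons grows with the square of the row count.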

But as I said, the table had 40,000 rows. That means the join has to filter 40,000 * 40,000 = 1.6 billion row pairs. The computation of the query took a very long time, so I used another nice feature of postgres - transactions.

The idea was to add a temporary index to get the computation done in a much faster time (the index is also described in the documentation).

begin;
create index on ranges using gist(int8range(lower_bound, upper_bound, '[]'));
select *
  from ranges r1
  join ranges r2
    on int8range(r1.lower_bound, r1.upper_bound, '[]') &&
       int8range(r2.lower_bound, r2.upper_bound, '[]')
 where r1.range_id != r2.range_id;
rollback;

The overall runtime in my case was 300ms, so the write lock held while creating the index wasn’t that much of a concern anymore.

learning the ansible way

Gibheer
2014-08-08
19-13

Some weeks ago I read a blog post about rolling out your configs with ansible as a way to learn how to use it. The post wasn’t full of information on how to do it, but the author’s repository was a great inspiration.

As I stopped using cfengine and instead wanted to use ansible, that was a great opportunity to further learn how to use it and I have to say, it is a really nice experience. Apart from a bunch of configs I find every now and then, I have everything in my config repository.

The config is split at the moment between servers and workstations, but using an inventory file with localhost. As I mostly use freebsd and archlinux, I had to set the python interpreter path to different locations. There are two ways to do that in ansible. The first is to add it to the inventory

[hosts]
localhost

[hosts:vars]
ansible_connection=local
ansible_python_interpreter=/usr/local/bin/python2

and the other is to set it in the playbook

- hosts: hosts
  vars:
    ansible_python_interpreter: /usr/local/bin/python2
  roles:
    - vim

The latter has the small disadvantage, that running plain ansible is not possible. Ansible in the command and check mode also needs an inventory and uses the variables too. But if they are not stated there, ansible has no idea what to do. But at the moment, it isn’t so much a problem. Maybe that problem can be solved by using a dynamic inventory.

What I can definitely recommend is using roles. These are descriptions of what to do and can be filled with variables from the outside. I have used them to bundle all tasks for one topic. Then I can include these for the hosts I want them to have, which makes for rather nice playbooks. One good example is my vim config, as it shows how to use lists.

All in all I’m pretty impressed how well it works. At the moment I’m working on a way to provision jails automatically, so that I can run the new server completely through ansible. That should make moving to a new server in the future much easier.

playing with go

Gibheer
2014-04-04
22-39

For some weeks now I have been playing with Go, a programming language developed with support from google. I’m not really sure yet, if I like it or not.

The ugly things first - so that the nice things can be enjoyed longer.

Go’s package management is probably one of the worst points of the language. It has a built-in system to load code from any repository system, but everything has to be versioned. The weird thing is that they forgot to make it possible to pin the dependencies to a specific version. Some projects are on the way to implement this feature, but it will probably take some time.

What I also miss is a shell to test code and just try stuff, as Go is a compiled language. I really like such a shell for small code spikes, calculations and the like. I really hope they will include one sometime in the future, but I doubt it.

With that also comes a very strict project directory structure, which makes it nearly impossible to just open a project and code away. One has to move into the project structure.

The naming of functions and variables is strict too. Everything is bound to the package namespace by default. If the variable, type or function begins with a capital letter, it means that the object is exported and can be used from other packages.

// a public function
func FooBar() {
}

// not a public function
func fooBar() {
}

Coming from other programming languages, it might be a bit irritating and I still don’t really like the strictness, but my hands learned the lesson and mostly capitalize it for me.

Now the most interesting part for me is that I can use Go very easily. I have to look up many of the functions, but the syntax is very easy to learn. Just for fun I built a small cassandra benchmark in a couple of hours and it works very nicely.

After some adjustments it even ran in parallel and is now stressing a cassandra cluster for more than 3 weeks. That was a very nice experience.

Starting a goroutine, Go’s lightweight kind of thread, is surprisingly easy. There is not much needed to get it started.

go function(arg1, arg2)

It is really nice to just include a small two letter command to get the function to run in parallel.

Go also includes a feature I wished for some time in Ruby. Here is an example of what I mean

def foo(arg1)
  return unless arg1.respond_to?(:bar)
  do_stuff
end

What this function does is test the argument for a specific method. Essentially it is an interface without a name. For some time I found that pretty nice to ask for methods instead of some weird name someone put behind the class name.

The Go designers found another way for the same problem. They called them also interfaces, but they work a bit differently. The same example, but this time in Go

type Barer interface {
  Bar()
}

func foo(b Barer) {
  // do stuff with b, for example call the method
  b.Bar()
}

In Go, we give our method constraint a name and use that in the function definition. But instead of adding the name to the struct or class like in Java, only the method has to be implemented and the compiler takes care of the rest.
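To see the compiler doing that work, here is a complete program that can be run as is. It is a small variation of the example above. The type duck and the string return value are made up for the illustration.

```go
package main

import "fmt"

// Barer is satisfied by any type that has a Bar method with this signature.
type Barer interface {
	Bar() string
}

// duck never mentions Barer, it only implements the method.
type duck struct{}

func (duck) Bar() string { return "quack" }

// foo accepts anything that satisfies Barer.
func foo(b Barer) string {
	return "got " + b.Bar()
}

func main() {
	fmt.Println(foo(duck{})) // prints "got quack"
}
```

If the Bar method is removed from duck, the program no longer compiles, so the check happens before the code ever runs and not at runtime as with respond_to? in Ruby.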

But the biggest improvement for me is the tooling around Go. They deliver it with a formatting tool, a documentation tool and a test tool. And everything works blazingly fast. Even the compiler runs in mere seconds instead of minutes. It is actually fun to have such a fast feedback cycle with a compiled language.

So for me, Go is definitely an interesting but not perfect project. The language definition is great and the tooling is good. But the strict and weird project directory structure and project management is currently a big problem for me.

I hope they get that figured out and then I will gladly use Go for some stuff.

no cfengine anymore

Gibheer
2014-03-16
10-51

I thought I could write more good stuff about cfengine, but it had some pretty serious issues for me.

The first issue is the documentation. There are two documents available: one for an older version, which is very well written, and a newer one, which is a nightmare to navigate. I would use the older version, if it worked all the time.

The second issue is that cfengine can destroy itself. cfengine is one of the oldest configuration management systems and I didn’t expect that.

Given a configuration error, the server will still hand out the files to the agents. As the agent pulls are configured in the same promise files as the rest of the system, an error in any file will result in the agent not being able to pull any newer version.

Furthermore, the syntax is not easy at all and has some bogus limitations. For example it is not allowed to name a promise file with a dash. But instead of a warning or error, cfengine just can’t find the file.

This is not at all what I expect to get.

What I need is a system, which can’t deactivate itself or even better, just runs on a central server. I also didn’t want to run weird scripts just to get ruby compiled on the system to setup the configuration management. In my eyes, that is part of the job of the tool.

The only one I found which can handle that seems to be ansible. It is written in python and runs all commands remotely with the help of python or in a raw mode. The first tests also looked very promising. I will keep posting how it is going.

scan to samba share with HP Officejet pro 8600

Gibheer
2014-03-16
10-28

Yesterday I bought a printer/scanner combination, an HP Officejet pro 8600. It has some nice functions included, but the most important for us was the ability to scan to a network storage. As I did not find any documentation on how to get the printer to speak with a samba share, I will describe it here.

To get started I assume, that you already have a configured and running samba server.

The first step is to create a new system user and group. This user will be used to create a login on the samba server for the scanner. The group will hold all users which should have access to the scanned documents. The following commands are for freebsd, but there should be an equivalent for any other system (like useradd).

pw groupadd -n scans
pw useradd -n scans -u 10000 -c "login for scanner" -d /nonexistent -g scans -s /usr/sbin/nologin

We can already add the user to the samba user management. Don’t forget to set a strong password.

smbpasswd -a scans

As we have the group for all scan users, we can add every account which should have access

pw groupmod scans -m gibheer,stormwind

Now we need a directory to store the scans into. We make sure, that none other than group members can modify data in that directory.

zfs create rpool/export/scans
chown scans:scans /export/scans
chmod 770 /export/scans

Now that we have the system stuff done, we need to configure the share in the samba config. Add and modify the following part

[scans]
comment = scan directory
path = /export/scans
writeable = yes
create mode = 0660
guest ok = no
valid users = @scans

Now restart/reload the samba server and the share should be good to go. The only thing left is to configure the scanner to use that share. I did it over the web interface. For that, go to https://<yourscannerhere>/#hId-NetworkFolderAccounts. Then we add a new network folder with the following data:

  • display name: scans
  • network path:
  • user name: scans
  • password:

In the next step, you can secure the network drive with a PIN. In the third step you can set the default scan settings and then you are done. Save and test the settings and everything should work fine. The first scan will be named scan.pdf and all following ones have an id appended. Too bad there isn’t a setting to append a timestamp instead. But it is still very nice to be able to scan to a network device.

[cfengine] log to syslog

Gibheer
2014-02-24
21-51

When you want to start with cfengine, it is not exactly obvious how some stuff works. To make it easier for others, I will write about some stuff I find out in the process.

For the start, here is the first thing I found out. By default cfengine logs to files in the work directory. This can get a bit ugly when the agent is running every 5 minutes. As I use cf-execd, I added the option executorfacility to the execd section.

body executor control {
  executorfacility => "LOG_LOCAL7";
}

After that a restart of execd will result in logs appearing through syslog.

overhaul of the blog

Gibheer
2014-02-19
09-42

The new blog is finally online. It took us more than a year to get the new design done.

First we replaced thin with puma. Thin was becoming more and more of a bother and didn’t really work reliably anymore. Because of the software needed, it was pinned to a specific version of rack, thin, rubinius and some other stuff. Changing one dependency meant a lot of work to get it going again. Puma together with rubinius makes a pretty nice stack and so far it has worked pretty well. We will see how well it can handle running longer than a few hours.

The next thing we did was throw out sinatra and replace it with zero, our own toolkit for building small web applications. But instead of building yet another object spawning machine, we tried something different. The new blog uses a chain of functions to process a request into a response. This has the advantage that the number of objects kept around for the lifetime of a request is minimized, the stack level is smaller and overall it should now need much less memory to process a request. From the numbers, things are looking good, but we will see how it will behave in the future.
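The idea of such a function chain can be sketched in a few lines. The blog itself is written in Ruby, so the following Go code is only an illustration of the concept and not how zero is implemented. All names in it (request, response, step, chain, route, render) are invented for this example.

```go
package main

import "fmt"

type request struct{ path string }

type response struct {
	status int
	body   string
}

// step takes the request and the response built so far and returns the
// response for the next step.
type step func(req request, res response) response

// chain runs the steps in order and passes the response along.
func chain(steps ...step) func(request) response {
	return func(req request) response {
		res := response{status: 200}
		for _, s := range steps {
			res = s(req, res)
		}
		return res
	}
}

// route marks every path except "/" as not found.
func route(req request, res response) response {
	if req.path != "/" {
		res.status = 404
	}
	return res
}

// render fills the body depending on the status set by earlier steps.
func render(req request, res response) response {
	if res.status == 200 {
		res.body = "hello"
	} else {
		res.body = "not found"
	}
	return res
}

func main() {
	handle := chain(route, render)
	fmt.Println(handle(request{path: "/"}))
	fmt.Println(handle(request{path: "/nope"}))
}
```

Each step only receives values and returns a value, so nothing but the request and the response has to be kept around while a request is processed.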

On the frontend we minimized the layout further, but also added some nice functionality. It is now possible to view one post after another through the same pagination mechanism. This should make for a nice experience when reading a number of posts one after another.

We hope you like the new design and will enjoy reading our stuff in the future too.
