minimal nginx configuration

Gibheer
2015-03-25
22-11

As I was asked today how I manage my nginx setup, I thought I'd write it down.

The configuration was inspired by a blog entry of Zach Orr (the blog post seems to be gone since 2014). The setup consists of one main configuration and multiple domain-specific configuration files which get sourced in the main config. If a domain uses certificates, these are pulled in by their respective files.

I will leave out the performance stuff to make the config more readable. As the location of the config files differs per platform, I will use $CONF_DIR as a placeholder.

main configuration

The main configuration $CONF_DIR/nginx.conf first sets some global stuff.

# global settings
user www www;
pid /var/run/nginx.pid;

This takes care of dropping the privileges to the www user and group after the start and defines where the pid file is written.

Next is the http section, which sets the defaults for all server parts.

http {
  include      mime.types;
  default_type application/octet-stream;
  charset      UTF-8;

  # activate some modules
  gzip on;
  # set some defaults for modules
  ssl_protocols TLSv1.2 TLSv1.1 TLSv1;

  include sites/*.conf;
}

This part sets some default options for all server sections and keeps the separate configuration files small. In this example the mime types are included (a large file with mime type definitions) and the default charset and mime type are set.

In this section we can also activate modules like gzip (see gzip on nginx) or set some options for modules like ssl (see ssl on nginx).

The last option is to include more config files from the sites directory. This is the directive which makes it possible to split up the configs.

server section config

The server section config may look different for each purpose. Here are some smaller config files just to show what is possible.

static website

For example the file $CONF_DIR/sites/static.zero-knowledge.org.conf looks like this:

server {
  listen 80;
  server_name static.zero-knowledge.org;

  location / {
    root /var/srv/static.zero-knowledge.org/htdocs;
    index index.html;
  }
}

In this case a domain is configured to deliver static content from the directory /var/srv/static.zero-knowledge.org/htdocs on port 80 for the domain static.zero-knowledge.org. If the root path is requested in the browser, nginx will look for the index.html to show.

reverse proxy site

For a reverse proxy setup, the config $CONF_DIR/sites/zero-knowledge.org.conf might look like this.

server {
  listen 80;
  server_name zero-knowledge.org;

  location / {
    proxy_pass http://unix:/tmp/reverse.sock;
    include proxy_params;
  }
}

In this case, nginx will also listen on port 80, but for the host zero-knowledge.org. All incoming requests will be forwarded to the local unix socket /tmp/reverse.sock. You can also define IPs and ports here, but for an easy setup, unix sockets are simpler. The line include proxy_params; includes the config file proxy_params, which sets some headers when forwarding the request, for example Host or X-Forwarded-For. A number of such config files should already be included with the nginx package, so the best is to take a look in $CONF_DIR.
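If your package does not ship a proxy_params file, it only needs a few proxy_set_header lines. A minimal sketch of such a file, the exact contents depend on your distribution:

# $CONF_DIR/proxy_params - sketch
proxy_set_header Host              $http_host;
proxy_set_header X-Real-IP         $remote_addr;
proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;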

uwsgi setup

As I got my graphite setup running some days ago, I can also provide a very bare uwsgi config, which actually looks like the reverse proxy config.

server {
  listen 80;
  server_name uwsgi.zero-knowledge.org;

  location / {
    uwsgi_pass uwsgi://unix:/tmp/uwsgi_graphite.sock;
    include uwsgi_params;
  }
}

So instead of proxy_pass, uwsgi_pass is used to tell nginx that it has to speak the uwsgi protocol. The included uwsgi_params file is, like the proxy_params file, a collection of parameters to pass along with each request.
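The uwsgi_params file normally ships with the nginx package and maps request data to uwsgi variables. An excerpt of what it typically contains:

# excerpt of a typical uwsgi_params file
uwsgi_param QUERY_STRING    $query_string;
uwsgi_param REQUEST_METHOD  $request_method;
uwsgi_param CONTENT_TYPE    $content_type;
uwsgi_param CONTENT_LENGTH  $content_length;
uwsgi_param REQUEST_URI     $request_uri;
uwsgi_param SERVER_PROTOCOL $server_protocol;
uwsgi_param REMOTE_ADDR     $remote_addr;
uwsgi_param SERVER_NAME     $server_name;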

conclusion

So this is my pretty minimal configuration for nginx. It helped me automate the configuration, as I just have to drop new config files in the directory and reload the server.
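Dropping in a new config file then only needs a config check and a reload. On FreeBSD that looks roughly like this, use the service mechanism of your platform:

$ nginx -t
$ service nginx reload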

I hope you liked it and have fun.

pgstats - vmstat like stats for postgres

Gibheer
2015-03-02
20-51

Some weeks ago a tool got my attention - pgstats. It was mentioned in a blog post, so I tried it out and it made a very good first impression.

Now version 1.0 has been released. It can be found on github.

It is a small tool to get statistics from postgres in intervals, just like iostat, vmstat and the other *stat tools. It has a number of modules to gather these, for example for databases, tables, index usage and the like.

If you are running postgres, you definitely should take a look at it.

setting zpool features

Gibheer
2014-12-10
13-40

Before Sun was bought by Oracle, OpenSolaris got ever newer ZFS pool versions and upgrading was just a

$ zpool upgrade rpool

away. But since then, the open source version of ZFS gained feature flags. Running zpool upgrade without arguments lists for each pool the features that are not enabled yet:

POOL  FEATURE
---------------
tank1
      multi_vdev_crash_dump
      enabled_txg
      hole_birth
      extensible_dataset
      embedded_data
      bookmarks
      filesystem_limits

If you want to enable only one of these features, you may have already hit the problem that zpool upgrade enables all of them at once, either for one pool or for all pools.

The way to go is to use zpool set. Feature flags are options on the pool and can also be listed with zpool get.

$ zpool get all tank1 | grep feature
tank1  feature@async_destroy          enabled                        local
tank1  feature@empty_bpobj            active                         local
tank1  feature@lz4_compress           active                         local
tank1  feature@multi_vdev_crash_dump  disabled                       local
...

Enabling a feature, for example multi_vdev_crash_dump, would then be

$ zpool set feature@multi_vdev_crash_dump=enabled tank1

The feature will then disappear from the zpool upgrade output and show up as enabled (or active, once it is actually in use) in zpool get.
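To check a single feature afterwards, zpool get also accepts the property name directly; the output will look roughly like this:

$ zpool get feature@multi_vdev_crash_dump tank1
NAME   PROPERTY                       VALUE    SOURCE
tank1  feature@multi_vdev_crash_dump  enabled  local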

using unbound and dnsmasq

Gibheer
2014-12-09
22-13

After some time of using an Almond as our router and always having trouble with disconnects, I bought a small apu1d4, an AMD low power board, as our new router. It is now running FreeBSD and is very stable. Not a single connection was dropped yet.

As we have some services in our network, like a fileserver and a printer, we always wanted to use names instead of IPs, but no router so far could provide that. So this was the first problem I solved.

FreeBSD comes with unbound preinstalled. Unbound is a caching DNS resolver, which helps answer DNS queries faster when they have been queried before. I wanted to use unbound as the primary source for DNS queries, as the caching functionality is pretty nice. Further, I wanted an easy DHCP server which would also function as a DNS server. For that purpose dnsmasq fits best. There are also ways to use dhcpd, bind and some glue to get the same result, but I wanted as few services as possible.

So my setup constellation looks like this:

client -> unbound -> dnsmasq
             +-----> ISP dns server

For my internal tld, I will use zero. The dns server is called cerberus.zero and has the IP 192.168.42.2. The network for this setup is 192.168.42.0/24.

configuring unbound

For this to work, we first configure unbound to make name resolution work at all. Most options already have pretty good defaults, so we only override what we need in a file in /etc/unbound/conf.d/, in my case /etc/unbound/conf.d/zero.conf.

server:
  interface: 127.0.0.1
  interface: 192.168.42.2
  do-not-query-localhost: no
  access-control: 192.168.42.0/24 allow
  local-data: "cerberus. 86400 IN A 192.168.42.2"
  local-data: "cerberus.zero. 86400 IN A 192.168.42.2"
  local-data: "2.42.168.192.in-addr.arpa 86400 IN PTR cerberus.zero."
  local-zone: "42.168.192.in-addr.arpa" nodefault
  domain-insecure: "zero"

forward-zone:
  name: "zero"
  forward-addr: 127.0.0.1@5353

forward-zone:
  name: "42.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@5353

So what happens here is the following. First we tell unbound on which addresses it should listen for incoming queries. Next we state that querying dns servers on localhost is totally okay. This is needed to later be able to resolve addresses through the local dnsmasq. If your dnsmasq is running on a different machine, you can leave this out. With access-control we allow the network 192.168.42.0/24 to query the dns server. The three local-data lines tell unbound that the names cerberus and cerberus.zero, as well as the reverse entry, all point to one and the same machine, the DNS server itself. Without these lines unbound would not resolve the name of the local server, even if it were stated in /etc/hosts. The local-zone line with nodefault enables name resolution for the reverse zone of the local network, which unbound would otherwise refuse to answer. Finally, domain-insecure tells unbound that this domain has no support for DNSSEC, which is enabled by default in unbound.

The two forward-zone entries tell unbound, where it should ask for queries regarding the zero tld and the reverse entries of the network. The address in this case points to the dnsmasq instance. In my case, that is running on localhost and port 5353.

Now we can add unbound to /etc/rc.conf and start unbound for the first time with the following command

$ sysrc local_unbound_enable=YES && service local_unbound start

Now you should be able to resolve the local hostname already

$ host cerberus.zero
cerberus.zero has address 192.168.42.2

configuring dnsmasq

The next step is to configure dnsmasq, so that it provides DHCP and name resolution for the network. When adjusting the config, please read the comments for each option in your config file carefully. You can find an example config in /usr/local/etc/dnsmasq.conf.example. Copy it to /usr/local/etc/dnsmasq.conf and open it in your editor:

port=5353
domain-needed
bogus-priv
no-resolv
no-hosts
local=/zero/
except-interface=re0
bind-interfaces
local-service
expand-hosts
domain=zero
dhcp-range=192.168.42.11,192.168.42.200,255.255.255.0,48h
dhcp-option=option:router,192.168.42.2
dhcp-option=option:dns-server,192.168.42.2
dhcp-host=00:90:f5:f0:fc:13,0c:8b:fd:6b:04:9a,sodium,192.168.42.23,96h

First we set the port to 5353, as referenced in the unbound config. On this port dnsmasq will listen for incoming dns requests. The next two options, domain-needed and bogus-priv, avoid forwarding dns requests needlessly. The option no-resolv keeps dnsmasq from learning about any other dns server, and no-hosts does the same for /etc/hosts. Its sole purpose is to provide DNS for the local domain, so it doesn't need to know about anything else.

The next option tells dnsmasq for which domain it is responsible. It will also avoid answering requests for any other domain.

except-interface tells dnsmasq on which interfaces not to listen. You should list here all external interfaces, so that queries from the open internet cannot be used to detect hosts on your internal network. The option bind-interfaces makes dnsmasq listen only on the allowed interfaces instead of listening on all interfaces and filtering the traffic. This makes dnsmasq a bit more secure, as not listening at all is better than listening and discarding.

The two options expand-hosts and domain=zero will expand plain host names with the given domain part if it is missing. This way it is easier to resolve hosts in the local domain.

The next three options configure the DHCP part of dnsmasq. First is the range. In this example it starts at 192.168.42.11, ends at 192.168.42.200 and all IPs get a 48h lease time. So if a new host enters the network, it will be given an IP from this range. The next two lines set options sent with the DHCP offer to the client, so it learns the default route and dns server. As both are running on the same machine in my case, they point to the same IP.

All machines which should have a static name and/or IP can then be set through dhcp-host lines. You have to give the mac address, the name, the IP and the lease time. There are many examples in the example dnsmasq config, so it is best to read those.

When your configuration is done, you can enable the dnsmasq service and start it

$ sysrc dnsmasq_enable=YES && service dnsmasq start

When you get your first IP, do the following request and it should give you your IP

$ host $(hostname)
sodium.zero has address 192.168.42.23

With this, we have a running DNS server setup with DHCP.
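To check the whole chain from another machine, the host command can also be pointed directly at the new server; the names here match my example setup, so adjust them for yours. It should answer with the same address as above:

$ host sodium.zero 192.168.42.2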

common table expressions in postgres

Gibheer
2014-10-13
21-45

Four weeks ago I was asked to show some features of PostgreSQL. In that presentation I came up with an interesting statement with which I could show off a nice feature.

What I'm talking about is the usage of common table expressions (CTE for short) and explain.

Common table expressions create a named temporary result set that exists just for one query. The result can be used anywhere in the rest of the query. This is pretty useful to break sub selects into smaller chunks, but also to create DML statements which return data.

A statement using CTEs can look like this:

with numbers as (
  select generate_series(1,10)
)
select * from numbers;

But it gets even nicer, when we can use this to move data between tables, for example to archive old data.

Let's create a table and an archive table and try it out.

$ create table foo(
  id serial primary key,
  t text
);
$ create table foo_archive(
  like foo
);
$ insert into foo(t)
  select generate_series(1,500)::text;

The like option can be used to copy the table structure to a new table.
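By default like only copies the column definitions. If defaults, constraints and indexes should be copied as well, the including options can be added, for example (a variant not used in this example):

$ create table foo_archive(
  like foo including all
);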

The table foo is now filled with data. Next we will delete all rows where the modulus 25 of the ID resolves to 0 and insert those rows into the archive table.

$ with deleted_rows as (
  delete from foo where id % 25 = 0 returning *
)
insert into foo_archive select * from deleted_rows;

Another nice feature of postgres is the possibility to get an explain from a delete or insert. So when we prepend explain to the above query, we get this explain:

                            QUERY PLAN
───────────────────────────────────────────────────────────────────
 Insert on foo_archive  (cost=28.45..28.57 rows=6 width=36)
   CTE deleted_rows
     ->  Delete on foo  (cost=0.00..28.45 rows=6 width=6)
           ->  Seq Scan on foo  (cost=0.00..28.45 rows=6 width=6)
                 Filter: ((id % 25) = 0)
   ->  CTE Scan on deleted_rows  (cost=0.00..0.12 rows=6 width=36)
(6 rows)

This explain shows, that a sequence scan is done for the delete and grouped into the CTE deleted_rows, our temporary view. This is then scanned again and used to insert the data into foo_archive.

range types in postgres

Gibheer
2014-08-08
20-23

Nearly two years ago, Postgres got a very nice feature - range types. These are available for timestamps, numerics and integers. The problem is that until now I didn't have a good example of what one could do with them. But today someone gave me a quest to use them!

His problem was that they had id ranges used by customers and they weren't sure whether any of them overlapped. The table looked something like this:

create table ranges(
  range_id serial primary key,
  lower_bound bigint not null,
  upper_bound bigint not null
);

With data like this

insert into ranges(lower_bound, upper_bound) values
  (120000, 120500), (123000, 123750), (123750, 124000);

They had something like 40,000 rows of that kind. So this was perfect for using range type queries.
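As a quick illustration of how the range constructor and the overlap operator behave (a made up example, not part of the original data):

select int8range(120000, 120500, '[]') && int8range(120500, 121000, '[]');
-- true, with inclusive bounds both ranges contain 120500

select int8range(120000, 120500, '[)') && int8range(120500, 121000, '[)');
-- false, the upper bound of the first range is excluded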

To find out, if there was an overlap, I used the following query

select *
  from ranges r1
  join ranges r2
    on int8range(r1.lower_bound, r1.upper_bound, '[]') &&
       int8range(r2.lower_bound, r2.upper_bound, '[]')
 where r1.range_id != r2.range_id;

In this case, int8range takes two bigint values and converts them into a range. The string '[]' defines whether the bounds are included in or excluded from the range. In this example both are included. The output for this query looked like the following

 range_id │ lower_bound │ upper_bound │ range_id │ lower_bound │ upper_bound
──────────┼─────────────┼─────────────┼──────────┼─────────────┼─────────────
        2 │      123000 │      123750 │        3 │      123750 │      124000
        3 │      123750 │      124000 │        2 │      123000 │      123750
(2 rows)

Time: 0.317 ms

But as I said, the table had 40,000 rows. That means the join has to check about 1.6 billion combinations. The computation of the query took a very long time, so I used another nice feature of postgres - transactions.

The idea was to add a temporary index to get the computation done in a much faster time (the index is also described in the documentation).

begin;
create index on ranges using gist(int8range(lower_bound, upper_bound, '[]'));
select *
  from ranges r1
  join ranges r2
    on int8range(r1.lower_bound, r1.upper_bound, '[]') &&
       int8range(r2.lower_bound, r2.upper_bound, '[]')
 where r1.range_id != r2.range_id;
rollback;

The overall runtime in my case was 300ms, so the write lock taken while building the index wasn't much of a concern anymore.

learning the ansible way

Gibheer
2014-08-08
19-13

Some weeks ago I read a blog post about rolling out your configs with ansible as a way to learn how to use it. The post wasn't full of information on how to do it, but the author's repository was a great inspiration.

As I stopped using cfengine and wanted to use ansible instead, that was a great opportunity to further learn how to use it, and I have to say, it is a really nice experience. Apart from a bunch of configs I still find every now and then, I have everything in my config repository.

The config is currently split between servers and workstations, both using an inventory file with localhost. As I mostly use freebsd and archlinux, I had to set the python interpreter path to different locations. There are two ways to do that in ansible. The first is to add it to the inventory

[hosts]
localhost

[hosts:vars]
ansible_connection=local
ansible_python_interpreter=/usr/local/bin/python2

and the other is to set it in the playbook

- hosts: hosts
  vars:
    ansible_python_interpreter: /usr/local/bin/python2
  roles:
    - vim

The latter has the small disadvantage that running plain ansible in ad-hoc or check mode is not possible, as these also need an inventory and use the variables defined there. If the variables are not stated in the inventory, ansible has no idea what to do. At the moment that isn't much of a problem for me. Maybe it can be solved by using a dynamic inventory.

What I can definitely recommend is using roles. These are descriptions of what to do and can be filled with variables from the outside. I have used them to bundle all tasks for one topic. Then I can include these for the hosts I want them on, which makes for rather nice playbooks. One good example is my vim config, as it shows how to use lists; a sketch of such a role follows below.
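A minimal sketch of what such a role task file could look like; the paths and file names are made up for illustration:

# roles/vim/tasks/main.yml - sketch
- name: create vim config directories
  file: path={{ ansible_env.HOME }}/.vim/{{ item }} state=directory
  with_items:
    - autoload
    - bundle

- name: deploy vimrc
  copy: src=vimrc dest={{ ansible_env.HOME }}/.vimrc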

All in all I'm pretty impressed how well it works. At the moment I'm working on a way to provision jails automatically, so that I can run the new server completely through ansible. That should make moving to a new server in the future much easier.

playing with go

Gibheer
2014-04-04
22-39

For some weeks now I have been playing with Go, a programming language developed with support from google. I’m not really sure yet, if I like it or not.

The ugly things first - so that the nice things can be enjoyed longer.

Go's package management is probably one of the worst points of the language. It has an included system to load code from any repository system, but everything has to live in a versioned repository. The weird thing is that they forgot to make it possible to pin dependencies to a specific version. Some projects are on the way to implement this feature, but it will probably take some time.

What I also miss is a shell to test code and just try stuff, since Go is a compiled language. I would really like that for small code spikes, calculations and the like. I really hope they will include one sometime in the future, but I doubt it.

With that also comes a very strict project directory structure, which makes it nearly impossible to just open a project anywhere and code away. One has to move the code into the expected workspace structure first.

The naming of functions and variables is strict too. Everything is bound to the package namespace by default. If the variable, type or function begins with a capital letter, it means that the object is exported and can be used from other packages.

// a public function
func FooBar() {
}

// not a public function
func fooBar() {
}

Coming from other programming languages, it might be a bit irritating and I still don’t really like the strictness, but my hands learned the lesson and mostly capitalize it for me.

Now the most interesting part for me is that I can use Go very easily. I still have to look up many of the functions, but the syntax is very easy to learn. Just for fun I built a small cassandra benchmark in a couple of hours and it works very nicely.

After some adjustments it even ran in parallel and is now stressing a cassandra cluster for more than 3 weeks. That was a very nice experience.

Starting a thread in Go is surprisingly easy. There is nothing much needed to get it started.

go function(arg1, arg2)

It is really nice to just include a small two letter command to get the function to run in parallel.
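If the program has to wait for the spawned functions to finish, the usual companion is a sync.WaitGroup. A small sketch of that pattern, not from my benchmark, just to show the idea:

package main

import (
	"fmt"
	"sync"
)

func work(id int, wg *sync.WaitGroup) {
	defer wg.Done() // signal that this goroutine is finished
	fmt.Println("worker", id, "done")
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)       // one more goroutine to wait for
		go work(i, &wg) // run work in parallel
	}
	wg.Wait() // block until all goroutines called Done
}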

Go also includes a feature I wished for some time in Ruby. Here is an example of what I mean

def foo(arg1)
  return unless arg1.respond_to?(:bar)
  do_stuff
end

What this function does is test the argument for a specific method. Essentially it is an interface without a name. For some time I found it pretty nice to be able to ask for methods instead of for some name someone put behind the class name.

The Go designers found another way to solve the same problem. They also called it interfaces, but they work a bit differently. Here is the same example, but this time in Go

type Barer interface {
  Bar()
}

func foo(b Barer) {
  b.Bar() // do stuff with anything that has a Bar method
}

In Go, we give our method constraint a name and use that in the function definition. But instead of adding the name to the struct or class like in Java, only the method has to be implemented and the compiler takes care of the rest.
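A type satisfies such an interface simply by having the method, without declaring anything. A small self-contained sketch, the type Drum is made up for illustration:

package main

import "fmt"

type Barer interface {
	Bar()
}

type Drum struct{}

// Drum has a Bar method, so it satisfies Barer automatically.
func (d Drum) Bar() {
	fmt.Println("boom")
}

func foo(b Barer) {
	b.Bar()
}

func main() {
	foo(Drum{}) // compiles because Drum implements Bar()
}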

But the biggest improvement for me is the tooling around Go. It is delivered with a formatting tool, a documentation tool and a test tool. And everything works blazingly fast. Even the compiler runs in mere seconds instead of minutes. It is actually fun to have such a fast feedback cycle with a compiled language.

So for me, Go is definitely an interesting, but not perfect, project. The language definition is great and the tooling is good. But the strict and weird project directory structure and the package management are currently a big problem for me.

I hope they get that figured out and then I will gladly use Go for some stuff.

no cfengine anymore

Gibheer
2014-03-16
10-51

I thought I could write more good stuff about cfengine, but it had some pretty serious issues for me.

The first issue is the documentation. There are two documents available: one for an older version, which is very well written, and a newer one which is a nightmare to navigate. I would use the older version if it still matched the current behaviour all the time.

The second issue is that cfengine can destroy itself. cfengine is one of the oldest configuration management systems and I didn’t expect that.

Given a configuration error, the server will still hand the broken files out to the agents. As the agent pulls are configured in the same promise files as the rest of the system, an error in any file results in the agent no longer being able to pull any new version.

Furthermore, the syntax is not easy at all and has some bogus limitations. For example, it is not allowed to name a promise file with a dash in it. But instead of a warning or an error, cfengine just can't find the file.

This is not at all what I expect to get.

What I need is a system which can't deactivate itself or, even better, just runs from a central server. I also don't want to run weird scripts just to get ruby compiled on the target system to set up the configuration management. In my eyes, that is part of the job of the tool.

The only one I found which can handle that seems to be ansible. It is written in python and runs all commands remotely with the help of python or in a raw mode. The first tests also looked very promising. I will keep posting how it is going.

scan to samba share with HP Officejet pro 8600

Gibheer
2014-03-16
10-28

Yesterday I bought a printer/scanner combination, a HP Officejet Pro 8600. It has some nice functions included, but the most important one for us was the ability to scan to a network storage. As I did not find any documentation on how to get the scanner to talk to a samba share, I will describe it here.

To get started I assume, that you already have a configured and running samba server.

The first step is to create a new system user and group. This user will be used as the login on the samba server for the scanner. The group will hold all users which should have access to the scanned documents. The following commands are for freebsd, but there should be an equivalent for any other system (like useradd).

pw groupadd -n scans
pw useradd -n scans -u 10000 -c "login for scanner" -d /nonexistent -g scans -s /usr/sbin/nologin

We can already add the user to the samba user management. Don't forget to set a strong password.

smbpasswd -a scans

As we have the group for all scan users, we can add every account which should have access

pw groupmod scans -m gibheer,stormwind

Now we need a directory to store the scans in. We make sure that no one other than the group members can access data in that directory.

zfs create rpool/export/scans
chown scans:scans /export/scans
chmod 770 /export/scans

Now that we have the system stuff done, we need to configure the share in the samba config. Add and modify the following part

[scans]
comment = scan directory
path = /export/scans
writeable = yes
create mode = 0660
guest ok = no
valid users = @scans

Now restart/reload the samba server and the share should be good to go. The only thing left is to configure the scanner to use that share. I did it over the web interface. For that, go to https://<yourscannerhere>/#hId-NetworkFolderAccounts. There we add a new network folder with the following data:

  • display name: scans
  • network path:
  • user name: scans
  • password:

In the next step you can secure the network drive with a pin. In the third step you can set the default scan settings and then you are done. Save and test the settings and everything should work fine. The first scan will be named scan.pdf and all following ones get an id appended. Too bad there isn't a setting to append a timestamp instead. But it is still very nice to be able to scan to a network device.
