Thursday, 22 June 2017

Vectorization and broadcasting

At the moment I'm actually ahead of schedule, and this week I started to work on support for vectorization on N-dimensional arrays. By far the biggest challenge was to implement proper broadcasting, and most of this post will be devoted to going through that. At the end I also mention some of the other things I have done during the week.

Broadcasting arrays

At the moment I have implemented support for broadcasting on all binary functions. Since all binary functions behave similarly with respect to broadcasting I will use $+$ in all my examples below, but this could in principle be any binary function working on intervals.

When adding two arrays, $A$ and $B$, of the same size the result is just an array of the same size with each entry containing the sum of the corresponding entries in $A$ and $B$. If $A$ and $B$ do not have the same size then we try to perform broadcasting. The simplest form of broadcasting is when $A$ is an array and $B$ is a scalar. Then we just take the value of $B$ and add it to every element in $A$. For example

> A = infsupdec ([1, 2; 3, 4])
A = 2×2 interval matrix
   [1]_com   [2]_com
   [3]_com   [4]_com
> B = infsupdec (5)
B = [5]_com
> A + B
ans = 2×2 interval matrix
   [6]_com   [7]_com
   [8]_com   [9]_com

However it is not only when one of the inputs is a scalar that broadcasting can be performed. Broadcasting is performed separately for each dimension of the input. We require either that the dimensions are equal, in which case no broadcasting is performed, or that one of the inputs has that dimension equal to $1$, in which case we concatenate this input with itself along that dimension until the sizes are equal. If for example $A$ has dimension $4\times4\times4$ and $B$ dimension $4\times4\times1$ we concatenate four copies of $B$ along the third dimension to get two arrays of the same size. Since a scalar has all dimensions equal to $1$ we see that it can be broadcast to any size. Both $A$ and $B$ can also be broadcast at the same time, along different dimensions, for example

> A = infsupdec (ones (1, 5, 2))
A = 1×5×2 interval array
ans(:,:,1) =
   [1]_com   [1]_com   [1]_com   [1]_com   [1]_com
ans(:,:,2) =
   [1]_com   [1]_com   [1]_com   [1]_com   [1]_com
> B = infsupdec ([1, 2, 3, 4, 5; 6, 7, 8, 9, 10])
B = 2×5 interval matrix
   [1]_com   [2]_com   [3]_com   [4]_com    [5]_com
   [6]_com   [7]_com   [8]_com   [9]_com   [10]_com
> A + B
ans = 2×5×2 interval array
ans(:,:,1) =
   [2]_com   [3]_com   [4]_com    [5]_com    [6]_com
   [7]_com   [8]_com   [9]_com   [10]_com   [11]_com
ans(:,:,2) =
   [2]_com   [3]_com   [4]_com    [5]_com    [6]_com
   [7]_com   [8]_com   [9]_com   [10]_com   [11]_com
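
The per-dimension rule above can be written down as a small function. The following is only a sketch in C++; the name broadcast_dims is made up for illustration and is not taken from Octave or the interval package.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Octave or the interval package):
// size of the result when broadcasting arrays of size `a` and `b`.
std::vector<std::size_t>
broadcast_dims (std::vector<std::size_t> a, std::vector<std::size_t> b)
{
  // Missing trailing dimensions count as 1, so pad the shorter size vector.
  const std::size_t n = a.size () > b.size () ? a.size () : b.size ();
  a.resize (n, 1);
  b.resize (n, 1);
  assert (a.size () == b.size ());

  std::vector<std::size_t> result (n);
  for (std::size_t k = 0; k < n; k++)
    {
      if (a[k] == b[k])
        result[k] = a[k];   // equal sizes: no broadcasting in dimension k
      else if (a[k] == 1)
        result[k] = b[k];   // A is repeated along dimension k
      else if (b[k] == 1)
        result[k] = a[k];   // B is repeated along dimension k
      else
        throw std::invalid_argument ("sizes cannot be broadcast");
    }
  return result;
}
```

With the sizes from the example above, broadcast_dims ({1, 5, 2}, {2, 5}) gives {2, 5, 2}.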

The implementation

I'll go through a little bit about my implementation. I warn you that I'm not that familiar with the internals of Octave so some things I say might be wrong, or at least not totally correct.

Internally all numerical arrays are stored as a linear vector and the dimensions are only metadata. This means that the most efficient way to walk through an array is with a linearly increasing index. When $A$ and $B$ have the same size the most efficient way to sum them is to go through the arrays linearly. In pseudocode

// Calculate C = A + B
for (int i = 0; i < numel (A); i++) {
  C(i) = A(i) + B(i);
}

This works fine, and apart from unrolling the loop or doing optimizations like that it is probably the most efficient way to do it.
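
To make the "linear vector plus metadata" picture concrete, here is a sketch of how subscripts map to a position in the linear storage. Octave stores arrays in column-major order, so the first subscript varies fastest. The function name linear_index is hypothetical and the indices are zero-based, as in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: zero-based linear index of the element with
// zero-based subscripts `sub` in a column-major array of size `dims`.
std::size_t
linear_index (const std::vector<std::size_t>& dims,
              const std::vector<std::size_t>& sub)
{
  assert (dims.size () == sub.size ());

  std::size_t index = 0;
  std::size_t stride = 1;   // distance between neighbours along dimension k
  for (std::size_t k = 0; k < dims.size (); k++)
    {
      assert (sub[k] < dims[k]);
      index += sub[k] * stride;
      stride *= dims[k];
    }
  return index;
}
```

For a $2\times3\times4$ array the last element, with zero-based subscripts (1, 2, 3), ends up at linear index 23.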

If $A$ and $B$ are not of the same size then one way to do it would be to simply extend $A$ and/or $B$ along the needed dimensions. This would however require copying a lot of data, something we want to avoid (memory access is expensive). Instead we try to be smart with our indexing to access the right data from both $A$ and $B$.
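
One common way to do such indexing, shown here only as a sketch and not as the code used in the interval package, is to give every broadcast dimension a stride of zero so that the same element is read again and again. The NDArray struct and both function names are hypothetical, plain doubles stand in for intervals, and the result size is assumed to be known already.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal N-d array: column-major data plus its size.
struct NDArray
{
  std::vector<std::size_t> dims;
  std::vector<double> data;
};

// Strides of `x` when viewed as an array of size `out_dims`. A dimension
// of size 1 gets stride 0, so its single element is reused.
static std::vector<std::size_t>
broadcast_strides (const NDArray& x, const std::vector<std::size_t>& out_dims)
{
  std::vector<std::size_t> strides (out_dims.size (), 0);
  std::size_t step = 1;
  for (std::size_t k = 0; k < out_dims.size (); k++)
    {
      const std::size_t d = k < x.dims.size () ? x.dims[k] : 1;
      assert (d == out_dims[k] || d == 1);
      if (d != 1)
        strides[k] = step;
      step *= d;
    }
  return strides;
}

// Calculate C = A + B without copying A or B to the full size.
NDArray
broadcast_add (const NDArray& a, const NDArray& b,
               const std::vector<std::size_t>& out_dims)
{
  const std::vector<std::size_t> sa = broadcast_strides (a, out_dims);
  const std::vector<std::size_t> sb = broadcast_strides (b, out_dims);

  std::size_t n = 1;
  for (std::size_t d : out_dims)
    n *= d;

  NDArray c {out_dims, std::vector<double> (n)};
  std::vector<std::size_t> sub (out_dims.size (), 0);  // current subscripts
  std::size_t ia = 0, ib = 0;                          // positions in a and b
  for (std::size_t i = 0; i < n; i++)
    {
      c.data[i] = a.data[ia] + b.data[ib];
      // Step the subscripts like an odometer, keeping ia and ib in sync.
      for (std::size_t k = 0; k < out_dims.size (); k++)
        {
          ia += sa[k];
          ib += sb[k];
          if (++sub[k] < out_dims[k])
            break;
          ia -= sa[k] * out_dims[k];
          ib -= sb[k] * out_dims[k];
          sub[k] = 0;
        }
    }
  return c;
}
```

The output is still written with a linearly increasing index; only the reads from the two inputs jump around.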

After asking on the IRC-channel I got pointed to this Octave function which performs broadcasting. My implementation, which can be found here, is heavily inspired by that function.

Performance

Here I compare the performance of the new implementation with the old one. Since the old one could only handle matrices we are limited by that. We can measure the time it takes to add two matrices $A$, $B$ with the code

tic; A + B; toc;

We do 10 runs for each test and all times are in seconds.

Addition of large matrices

Case 1: A = B = infsupdec (ones (1000, 1000));
       Old         New
       0.324722    0.277179
       0.320914    0.276116
       0.322018    0.276075
       0.318713    0.279258
       0.332041    0.279593
       0.318429    0.279987
       0.323752    0.279089
       0.317823    0.276036
       0.320509    0.280964
       0.320610    0.281123
Mean:  0.32195     0.27854

Case 2: A = B = infsupdec (ones (10, 100000));
        Old         New
        0.299321    0.272691
        0.297020    0.282591
        0.296460    0.274298
        0.294541    0.279661
        0.298306    0.277274
        0.301532    0.275531
        0.298163    0.278576
        0.298954    0.279868
        0.302849    0.275991
        0.297765    0.278806
Mean:   0.29849     0.27753

Case 3: A = B = infsupdec (ones (100000, 10));
        Old         New
        0.286433    0.279107
        0.289503    0.278251
        0.297562    0.279579
        0.292759    0.283311
        0.292983    0.281306
        0.290947    0.282310
        0.293025    0.286172
        0.294153    0.278886
        0.293457    0.278625
        0.296661    0.280804
Mean:   0.29275     0.28084

Broadcasting scalars

Case 4: A = infsupdec (ones (1000, 1000));
             B = infsupdec (1);
        Old         New
        0.298695    0.292419
        0.298158    0.292274
        0.305242    0.296036
        0.295867    0.291311
        0.296971    0.297255
        0.304297    0.292871
        0.298172    0.300329
        0.297251    0.291668
        0.299236    0.294128
        0.300457    0.298005
Mean:   0.29943     0.29463

Case 5: A = infsupdec (1);
             B = infsupdec (ones (1000, 1000));
        Old         New
        0.317276    0.291100
        0.316858    0.296519
        0.316617    0.292958
        0.316159    0.299662
        0.317939    0.301558
        0.322162    0.295338
        0.321277    0.293561
        0.314640    0.291500
        0.317211    0.295487
        0.317177    0.294376
Mean:   0.31773     0.29521

Broadcasting vectors

Case 6: A = infsupdec (ones (1000, 1000));
             B = infsupdec (ones (1000, 1));
        Old         New
        0.299546    0.284229
        0.301177    0.284458
        0.300725    0.276269
        0.299368    0.276957
        0.303953    0.278034
        0.300894    0.275058
        0.301776    0.276692
        0.302462    0.282946
        0.304010    0.275573
        0.301196    0.273109
Mean:   0.30151     0.27833

Case 7: A = infsupdec (ones (1000, 1000));
             B = infsupdec (ones (1, 1000));
        Old         New
        0.300554    0.295892
        0.301361    0.294287
        0.302575    0.299116
        0.304808    0.294184
        0.306700    0.291606
        0.301233    0.298059
        0.301591    0.292777
        0.302998    0.290288
        0.300452    0.291975
        0.305531    0.290178
Mean:   0.30278     0.29384

We see that in all cases the new version is on average at least as fast as the old version. In the old version the order of the input made a slight difference in performance (case 4 vs case 5). In the new version both inputs are treated in exactly the same way so we no longer see that difference.

Possible improvements

In theory the cases when we broadcast a scalar could be the fastest ones. If $B$ is a scalar we could, in pseudocode, do something similar to

// Calculate C = A + B with B scalar
for (int i = 0; i < numel (A); i++) {
  C(i) = A(i) + B;
}

This is however not implemented at the moment. Instead we use the ordinary routine to calculate the index for $B$ (since it is a scalar it will always evaluate to $1$). If we would like to optimize more for this case we could add a check for whether $A$ or $B$ is a scalar and then optimize for that. Of course this would also make the code more complicated, something to watch out for. At the moment I leave it like this but if we later want to optimize for that case it could be done.
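
Such a check could look roughly like the sketch below. It is not code from the package: the function name is made up, plain doubles stand in for intervals, and the general branch only handles inputs with the same number of elements, where a real version would call the broadcasting routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: C = A + B with a separate path for scalar inputs.
std::vector<double>
add_with_scalar_check (const std::vector<double>& a,
                       const std::vector<double>& b)
{
  if (b.size () == 1)
    {
      // B is a scalar: read it once and skip all index calculations.
      const double s = b[0];
      std::vector<double> c (a.size ());
      for (std::size_t i = 0; i < a.size (); i++)
        c[i] = a[i] + s;
      return c;
    }

  if (a.size () == 1)
    return add_with_scalar_check (b, a);  // fine here since + is commutative

  // General case, simplified to inputs of the same size.
  assert (a.size () == b.size ());
  std::vector<double> c (a.size ());
  for (std::size_t i = 0; i < a.size (); i++)
    c[i] = a[i] + b[i];
  return c;
}
```

Note that swapping the arguments only works for commutative functions; for something like subtraction the two scalar cases would each need their own loop, which is part of the added complexity mentioned above.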

Other work

Apart from the work to fix the broadcasting for binary functions there was very little to do for many of the functions. All binary functions that use this code, and all unary functions using even simpler code, worked directly after fixing the oct-files. Some of them required small changes to the documentation but other than that the Octave scripts were fine. So mainly it has been a matter of going through all the files and checking that they actually work.

Bug #51283

When going through all the functions I noticed a bug in the interval version of $\sin$,

> sin (infsupdec (0))
ans = [0]_com
> sin (infsupdec ([0, 0]))
ans = 1×2 interval vector
   [0, -0]_com   [0, -0]_com

The second version here is wrong: $-0$ should never be allowed as a value for the supremum of an interval. I was able to track this down to how Octave's $\max$ function works, see bug #51283. As Oliver writes there, the exact behaviour of the $\max$-function is not specified in IEEE Std 754-2008 so we cannot rely on that. To solve this I have added a line to manually set all $-0$ to $+0$ in the supremum of the interval.
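
The idea of that fix can be sketched in C++ as follows. This is an illustration with a made-up function name on a plain vector of upper bounds, not the line that was added to the package.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: replace every -0 among the upper bounds by +0.
void
normalize_negative_zero (std::vector<double>& sup)
{
  for (double& x : sup)
    {
      // -0 == 0 compares true, so the sign bit is needed to tell them apart.
      if (x == 0.0 && std::signbit (x))
        x = 0.0;
    }
}
```

Testing the sign bit explicitly avoids relying on how a particular $\max$ or addition treats signed zeros.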

 

Friday, 16 June 2017

Construction and Printing

This week I have started to work on methods for constructing and printing N-dimensional arrays of intervals. In my timeline I estimated that this work would take 2 weeks. However in this first week I have managed to complete most of the work. I will give some comments on how I have worked with the Mercurial repository, how the work went and different things I encountered along the path.

Working with Mercurial

This is essentially the first time I'm using Mercurial for revision control, though I have used git before. However I quickly learned how to use it for the basic tasks that I need, committing, comparing files and checking the history. As mentioned in a previous post you can find my repository here [1].

Coding style

When I started to work with the files I realized that they did not follow Octave's coding standard [2]. After a short discussion on the mailing list we decided that I will update the files I change to follow the standard coding style. Usually it is not a good idea to change coding style and add functionality in the same commit. However most of the changes to coding style are only white space changes so they can be ignored using the -w flag in Mercurial. Thus we decided that as long as the coding style changes are ignored with -w I will do them in the same commit as the added functionality. If there are coding style changes that are not only white space, the most common example being too long lines, I first do a commit with only the coding style changes. So if you want to take a look at the functionality I have added you will probably want to use the -w flag. Note however that I have not updated the coding style for any files I have not otherwise changed.

Committing

Normally I do one commit for each file, though in many cases the bare intervals and the decorated intervals have almost identical functions and in that case I commit changes to them both at the same time. Of course it also happens that I have to go back and make more changes to a file, in that case I just do another commit.

The actual work

The work went much faster than I expected. The main reason for this is that Octave has very good support for indexing. For example expressions like

isnai(x.inf <= x.sup) = false;

work just as well for matrices as for N-dimensional arrays. In fact the constructor for bare intervals even worked for N-dimensional arrays from the beginning; there I only had to make slight modifications to the documentation and add some tests!

Not all functions were that easy though. Some functions that have not been updated in a while clearly assumed the input was a matrix, for example in $hull$

sizes1 = cellfun ("size", l, 1);
sizes2 = cellfun ("size", l, 2);


In most cases I only needed to add more general indexing, often even making the code clearer.

In some functions all I had to do was to remove the check on the input data so that it would accept N-dimensional arrays. This was true for example in $cat$, where all I had to do was to remove the check and make some minor modifications to the documentation.

I can conclude with saying that Octave has great support for working with N-dimensional arrays. Since internally the data for intervals are stored only as arrays I could make good use of it!

Noteworthy things

While most functions were straightforward to modify, some required some thought. How should they even work for N-dimensional input?

Disp

When modifying the $disp$-function I chose to mimic how Octave handles displaying N-dimensional arrays. I noticed that this is different from how Matlab handles it. In Matlab we have

> x = zeros (2, 2, 2)

x(:,:,1) =

     0     0
     0     0


x(:,:,2) =

     0     0
     0     0


while in Octave it's

> x = zeros (2, 2, 2)
x =

ans(:,:,1) =

   0   0
   0   0

ans(:,:,2) =

   0   0
   0   0


I don't know the reason behind Octave's version. At least at first glance I think I prefer the way Matlab does it. But since I'm working in Octave I chose that style.

The next question was how to handle the subset symbol, $\subset$. The interval package uses $=$ or $\subset$ depending on if the string representation is exact or not. For example

> x = infsup (1/2048, 1 + 1/2048);
> format short; x
x ⊂ [0.00048828, 1.0005]
> format long; x
x = [0.00048828125, 1.00048828125]

How should this be handled for N-dimensional arrays? One way would be to switch all $=$ to $\subset$ if the representation is not exact. Another would be to use $\subset$ on all submatrices that do not have an exact string representation. The third way, and how it is implemented now, is to only change the first $=$ to $\subset$, the one after the variable name. Like this

> x(1,1,1:2) = infsup (1/2048, 1 + 1/2048)
x ⊂ 1×1×2 interval array

ans(:,:,1) =   [0.00048828, 1.0005]
ans(:,:,2) =   [0.00048828, 1.0005]


This might be a bit odd when you first look at it: in some places we use $=$ and in some $\subset$. Though I think it somehow makes sense, we are saying that $x$ is a subset of the $1\times1\times2$ interval array given by

ans(:,:,1) =   [0.00048828, 1.0005]
ans(:,:,2) =   [0.00048828, 1.0005]

which actually is true. Anyway I will leave it like this for now and then we might decide to switch it up later.

linspace and mince

The standard implementation of $linspace$ only supports scalar or vector input. It could be generalized to N-dimensional arrays by for example returning an (N+1)-dimensional array where the last dimension corresponds to the linearly spaced elements. But since this has not been done in the standard implementation I will at least wait with adding it for intervals.

The function $mince$ can be seen as an interval generalization of $linspace$. It takes an interval and returns an array of intervals whose union covers it. This could similarly be expanded to N dimensions by creating the array along dimension N+1. But again we choose to at least wait with adding this.
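
To illustrate what "an array of intervals whose union covers it" means, here is a rough sketch for a single interval. The Interval type and the name mince_sketch are made up for the example, and unlike the real $mince$ this version uses ordinary floating point arithmetic and makes no attempt to control rounding errors.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an interval: (lower bound, upper bound).
using Interval = std::pair<double, double>;

// Split x into n pieces of roughly equal width. Neighbouring pieces share
// an endpoint, so there are no gaps between them.
std::vector<Interval>
mince_sketch (Interval x, std::size_t n)
{
  assert (n >= 1 && x.first <= x.second);

  // Breakpoints, computed like linspace would (rounding errors ignored).
  std::vector<double> points (n + 1);
  const double width = x.second - x.first;
  for (std::size_t k = 0; k <= n; k++)
    points[k] = x.first + width * static_cast<double> (k)
                          / static_cast<double> (n);
  points[0] = x.first;    // keep the outer endpoints exactly as given
  points[n] = x.second;

  std::vector<Interval> pieces (n);
  for (std::size_t k = 0; k < n; k++)
    pieces[k] = Interval (points[k], points[k + 1]);
  return pieces;
}
```

For the interval $[0, 1]$ and four pieces this gives $[0, 0.25]$, $[0.25, 0.5]$, $[0.5, 0.75]$ and $[0.75, 1]$.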

meshgrid and ndgrid

The interval package already has an implementation of $meshgrid$. But since it previously did not support 3-dimensional arrays it had to fit 3-d data in a 2-d matrix. Now that it supports 3-d data it can output that instead.

Currently the interval package does not implement $ndgrid$. When I looked into it I realized that the standard implementation of $ndgrid$ actually works for interval arrays as well. I have not looked into the internals but in principle it should only need the $cat$ function, which is implemented for intervals. Further I noticed that the standard $meshgrid$ also works for intervals. However the interval implementation differs in that it converts all input to intervals, whereas the standard implementation allows for non-uniform output. Using the interval implementation of $meshgrid$ we have

> [X Y] = meshgrid (infsup (1:3), 4:6)
X = 3×3 interval matrix

   [1]   [2]   [3]
   [1]   [2]   [3]
   [1]   [2]   [3]

Y = 3×3 interval matrix

   [4]   [4]   [4]
   [5]   [5]   [5]
   [6]   [6]   [6]

but if we fall back to the standard implementation (by removing the interval implementation) we get

> [X Y] = meshgrid (infsup (1:3), 4:6)
X = 3×3 interval matrix

   [1]   [2]   [3]
   [1]   [2]   [3]
   [1]   [2]   [3]

Y =

   4   4   4
   5   5   5
   6   6   6

Note that the last matrix is not an interval matrix.

So the question is, should we implement a version of $ndgrid$ that converts everything to intervals or should we remove the implementation of $meshgrid$? It is most likely not a good idea for the two functions to behave differently. I think that removing the implementation of $meshgrid$ makes most sense. First of all it's less code to maintain, which is always nice. Secondly you can manually convert all input to the function to intervals if you want uniform output. If you do not want uniform output then the standard implementation works whereas the interval implementation does not, so the standard implementation is more general in a sense.

We have to choose what to do, but for now I leave it as it is.

Non-generalizable functions

From what I have found there is no way to create a 3-dimensional array in Octave in the same way you can create a 2-dimensional one with for example

M = [1, 2; 3, 4];

Instead higher dimensional arrays have to be created using other functions, for example $reshape$ or $zeros$, or by specifying the submatrices directly

M(:,:,1) = [1, 2; 3, 4];
M(:,:,2) = [5, 6; 7, 8];

This means that the function $\_\_split\_interval\_literals\_\_$, which is used to split a string like $"[1, 2; 3, 4]"$ into its separate components, cannot really be generalized to N dimensions.


[1] https://sourceforge.net/u/urathai/octave/ci/default/tree/
[2] http://wiki.octave.org/Octave_style_guide

Wednesday, 31 May 2017

The first day and my repository

Yesterday I handed in the last exam for the semester(!) and today I started to work on the project full time. I began by setting up a repository for my code. At the moment I have only cloned the interval package, I have yet to make my own changes to it.

I started to read the constructor for intervals to see how it works. While doing that I realized that there is much less to do than I thought. The constructor can actually handle creating intervals from numerical arrays of any dimension. The function displaying the interval can only handle up to two dimensions but if you look at the internal state it actually has more dimensions. It is not perfect though, it does not work for decorated intervals and it cannot handle strings and cells as input. But still, it makes it much easier for me.

I will continue to work and hopefully I have actually committed something by the end of the week. Next week I will be away but after that I will code for the rest of the summer.

Sunday, 28 May 2017

Timeline

This post will be about specifying a timeline for my work this summer. As I mentioned in the introductory post the work will be about implementing support for higher dimensional arrays in the interval package. To begin with I have divided the work into 5 different parts:

1. Construction and Printing

This part will be about implementing functions for creating and printing arrays. It will mainly consist of modifying the standard constructor and all the different functions used for printing intervals so they can handle higher dimensional arrays.

2. Vectorized Functions

Here I will work on generalizing all functions supporting vectorization to also support arrays of higher dimensions.

3. Folding Functions

Here I will work on generalizing all functions implementing some sort of folding to support higher dimensions. By folding I mean taking a multidimensional array as input and returning an array of lower dimension. Examples of these functions are $sum$, $prod$ and $dot$.

4. Plotting

I'm not sure what support Octave has for plotting higher dimensional arrays, but if there are some functions which could also be useful for intervals I will try to implement them here.

5. Documentation

I'll write the documentation alongside the rest of the work. In the end I will try to add some usage examples and integrate it better with the standard documentation.

So these are the parts into which I have divided my work and the timeline will be

  • Phase 1 (30/5 - 30/6)
    • Week 1: Setting up
    • Week 2-3: Construction and Printing
    • Week 4: Vectorized Functions
  • Phase 2 (3/7 - 28/7)
    • Week 5: Continue on Vectorized Functions 
    • Week 6-7: Folding Functions
    • Week 8: Plotting
  • Phase 3 (31/7 - 25/8)
    • Week 9: Continue on Plotting
    • Week 10-11: Documentation
    • Week 12: Finishing up!

My first week will be rather short since I have an exam due in the middle of the week. After that I will also be away for a week not working on the project, I have not counted that week in the timeline above.

Sunday, 21 May 2017

Introduction

Hi! I'm Joel Dahne and I will be using this blog to share the progress of my work for Google Summer of Code 2017 under GNU Octave where I will be working on improving the interval package.

About me

Currently I'm a master's student in mathematics at Uppsala University, Sweden. I'm just about to complete my first year and have one more to go. I first encountered interval numerics with my bachelor thesis, during which I tried to generalize the method described in this paper to higher dimensions. The code from the project is available on GitHub, https://github.com/Urathai/generateZeros. Working on that project ignited my interest in interval numerics and I want to work to increase its availability to the common user.

The project

In short my project is about adding support for N-dimensional arrays in the interval package. At the moment the package only has support for up to 2-dimensional arrays. Most of the time this is enough but sometimes it is limiting. The goal is to have the syntax identical to the one for normal arrays. Switching from floats to intervals should be as easy as adding $infsup$ in front of the variable.

During my time preparing I have also noticed that some of the folding functions (e.g. $sum$, $prod$ and $dot$) handle some edge cases differently than the standard implementation. Hopefully this can be resolved at the same time.

This blog

During my project I will try to update this blog with my current progress. It's my first time writing a blog so we will see how it goes. The next step for me is to create a more detailed plan of how I should structure my work during the summer, this will most likely be the subject of my next post.