Friday, 12 August 2016

Brain colours

A bunch of new colour scales for brain imaging

In brain imaging we produce lots of pictures, and in colour. One problem is that we all use different colour scales; another is that the scales we choose are often arbitrary.

Choosing colour scales

Luckily for us, the choice has been worked out by someone else. Christen et al. (2013) published a paper looking at exactly this, and they made some recommendations.

Here they are.

I simply followed these recommendations and used cubehelix. This scheme for generating colours has the advantage that intensity increases monotonically while the colours change.
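For readers who want to see the recipe itself, here is a minimal Python sketch of Green's cubehelix scheme (the constants come from the 2011 paper; this is an illustration, not the Matlab function linked below):

```python
import numpy as np

def cubehelix(n=256, start=0.5, rotations=-1.5, hue=1.0, gamma=1.0):
    """n-by-3 RGB colour map: grey-level intensity increases monotonically
    while the hue rotates around the colour cube (Green, 2011)."""
    lam = np.linspace(0, 1, n)
    fract = lam ** gamma                          # monotonically increasing intensity
    angle = 2 * np.pi * (start / 3 + 1 + rotations * lam)
    amp = hue * fract * (1 - fract) / 2           # deviation from the grey diagonal
    r = fract + amp * (-0.14861 * np.cos(angle) + 1.78277 * np.sin(angle))
    g = fract + amp * (-0.29227 * np.cos(angle) - 0.90649 * np.sin(angle))
    b = fract + amp * ( 1.97294 * np.cos(angle))
    return np.clip(np.stack([r, g, b], axis=1), 0, 1)
```

Because the amplitude vanishes at both ends, the map runs from black to white, and the constants are chosen so that luminance tracks the intensity term almost exactly.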

You can get the Matlab function to generate maps here.
You can download maps as csv files here.


Christen et al. (2013). Colorful brains: 14 years of display practice in functional neuroimaging. NeuroImage, 73, 30-39.

Green, D. A. (2011). A colour scheme for the display of astronomical intensity images. Bulletin of the Astronomical Society of India, 39, 289. (2011BASI...39..289G at ADS.)

Show me the data

Visualizing Data per group or condition

Nothing seems simpler than showing people what data look like. Yet in many cases we use ugly graphs that do not show the data; instead we show summary statistics and other non-intuitive measures (yes, I'm talking about standard errors and 95% confidence intervals).

I'm as guilty as everyone else. Recently, many psychologists, brain imagers and statisticians have been working out better ways to present data, and violin plots seem to be the right way to go.

What do you need to show

For starters you need to show the data, i.e. scatter plots. Then you want to check whether you have outliers. Finally, you need to show how the data are distributed. There are many ways to do that, such as histograms, but here we want to show the distribution on top of the scatter plots, and for that we can use kernel density estimates. The advantage of a KDE is that it shows the 'population' the data are supposed to come from.

Now that we show the data, we need to show a summary statistic. By default most people use the mean, but if the distribution is skewed that is not necessarily a good choice. Once you have this, you of course need to display the variation of that estimator.

My solution

I wrote a Matlab function to do just that. In the following I explain what the defaults are, and why I think these are the best choices.

Outlier detection

This is a pretty complicated field, and each method has pros and cons. Most often there is a balance to find between false positives and false negatives (see for instance some simulations I have done here). Here I chose to use S-outliers, a method based on the distances between each pair of data points, which has the advantage of not assuming that the distribution is symmetric.
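To make the idea concrete, here is a rough Python sketch of a pairwise-distance outlier rule built on the Sn scale estimator (the 1.1926 consistency factor is from Rousseeuw & Croux; the cutoff value is an illustrative choice, not necessarily the one used in my Matlab function, and the true Sn uses "high medians" rather than the plain median used here):

```python
import numpy as np

def sn_outliers(x, cutoff=2.2414):
    """Flag outliers from pairwise distances (no symmetry assumption).
    cutoff is an illustrative threshold, not a calibrated constant."""
    x = np.asarray(x, float)
    d = np.abs(x[:, None] - x[None, :])                   # all pairwise distances
    med_d = np.median(np.sort(d, axis=1)[:, 1:], axis=1)  # per-point median distance, self excluded
    Sn = 1.1926 * np.median(med_d)                        # simplified Sn scale estimator
    return med_d / Sn > cutoff                            # far-from-everyone points are flagged
```

Because everything is built from distances between points, no centre or symmetric spread is ever assumed.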

Density estimation

I tried many functions before writing this one. One of the reasons that pushed me to do so is this: in many cases the densities are not bounded, and that's just silly. You show your data, say reaction times, and your plot extends into negative values. Same with accuracy: your density goes beyond 100%. Come on, that is not possible. Here, by default, I use a histogram. Not just any histogram, but a random average shifted histogram (RASH). That means it is non-parametric (no a priori shape) and it is bounded.
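For the curious, here is a possible Python sketch of a RASH-type estimate (the bin-width rule and the number of shifted histograms are illustrative choices, not necessarily those of Bourel et al. or of my Matlab function):

```python
import numpy as np

def rash_density(x, grid, m=100, rng=None):
    """Random Average Shifted Histogram: average m histograms whose bin
    origins are randomly shifted; non-parametric and bounded to the data."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, float)
    h = 2.0 * x.std() * len(x) ** (-1.0 / 3.0)        # illustrative bin-width rule
    dens = np.zeros(len(grid))
    for _ in range(m):
        shift = rng.uniform(0, h)                     # random bin origin
        edges = np.arange(x.min() - h + shift, x.max() + h + shift, h)
        counts, edges = np.histogram(x, bins=edges, density=True)
        idx = np.clip(np.searchsorted(edges, grid, side='right') - 1,
                      0, len(counts) - 1)
        inside = (grid >= edges[0]) & (grid <= edges[-1])
        dens += np.where(inside, counts[idx], 0.0)    # zero outside the data range
    return dens / m
```

Averaging over random shifts smooths out the arbitrary choice of bin origin, while the density stays zero outside the observed data range, so reaction times never go negative and accuracy never exceeds 100%.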

Estimators and Intervals

The function by default uses the 5th decile of the Harrell-Davis estimator, which is essentially the median. Finally, I plot the 95% highest-density interval using a Bayesian bootstrap. This has the advantage of giving a straightforward interpretation of what you see: the interval containing the estimator with 95% probability.
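As an illustration, here is a Python sketch of a Bayesian-bootstrap highest-density interval for the median (a simple weighted median stands in for the Harrell-Davis estimator; the function name and defaults are illustrative):

```python
import numpy as np

def bb_hdi(x, nboot=1000, prob=0.95, seed=0):
    """Highest-density interval of the median via the Bayesian bootstrap."""
    rng = np.random.default_rng(seed)
    x = np.sort(np.asarray(x, float))
    n = len(x)
    est = np.empty(nboot)
    for b in range(nboot):
        w = rng.dirichlet(np.ones(n))                    # flat-Dirichlet posterior weights
        est[b] = x[np.searchsorted(np.cumsum(w), 0.5)]   # weighted median of the sample
    est.sort()
    k = int(np.floor(prob * nboot))                      # draws inside the interval
    j = np.argmin(est[k:] - est[:nboot - k])             # shortest such interval = HDI
    return est[j], est[j + k]
```

Each Dirichlet draw reweights the observed data, giving a posterior distribution for the estimator; the HDI is then simply the shortest interval containing 95% of those draws.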


The colours are picked from a scale I made using cubehelix. That means that, once printed in grey scale, it still shows differences, but of intensity only (it is also more colour-blind friendly).


Bourel et al. (2014). Random average shifted histograms. Computational Statistics & Data Analysis, 79, 149-164.

Morey et al. (2016). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin & Review, 23, 103-123. doi:10.3758/s13423-015-0947-8

Harrell & Davis (1982). A new distribution-free quantile estimator. Biometrika, 69(3), 635-640. doi:10.1093/biomet/69.3.635

Rousseeuw & Croux (1993). Alternatives to the median absolute deviation. Journal of the American Statistical Association, 88(424), 1273-1283.

Thursday, 7 July 2016

What happened during #OHBM2016

I'm reviewing here what was discussed during #OHBM2016, based on the Twitter hashtag.


Twitter analytics

During the conference, I had fun making word clouds from the Twitter feed of the hashtag #OHBM2016. Attendees could see the word clouds between presentations, and I think they made the welcome screen look pretty cool (you can find them on the @OHBM_members channel and on the OHBM Facebook page). If you thought some information was missing, that is simply because it was either not frequently discussed on Twitter or did not make it into the cloud (not all words appear, depending on design and size). There was no censoring, and you can blame me if something was not to your liking.

Method: using Google tools I fetched #OHBM2016 tweets for a given day and then ran a macro in Notepad++. The macro removes the following: RT, #OHBM2016 (and variants of that), and then splits sentences into a long vector of words from which I generated the clouds. If I noticed redundancy (visually), I merged the words: for instance 50 Poldrack and 100 @russpoldrack would give 150 @russpoldrack.

For this last word cloud, I fetched #OHBM2016 from Sunday 26 June to Friday 1 July and ran the macro. Again I merged redundant words, and I added back @ and http while removing non-informative words. This made a nicer cloud with Twitter handles and links. To check what was discussed and by whom, I cross-referenced the top tweets with posts on personal accounts.

The exciting stuff

The most discussed lectures were those of Tim Behrens and Fernando Lopes da Silva, closely followed by the talk from Gael Varoquaux. The main topics that engaged attendees were connectivity analyses, machine learning, power analyses, BIDS and yes, Brexit.

Connectivity analyses

Thomas Yeo aptly summarized the current state of connectivity analyses on the OHBM blog, so no need to say more about it here.


Machine learning

Pattern recognition for brain imaging relies heavily on machine learning techniques these days. Gael Varoquaux's talk, 'Cross-validation to assess decoder performance: the good, the bad, and the ugly', attracted many comments; in it he shows that leave-one-out strategies are biased and that N-fold cross-validation, provided we keep the data structure, works much better.


Power analyses

I am really glad that the use of power analyses is now at the forefront of neuroimagers' discussions. During Sunday's reproducibility workshop I discussed and presented two of the main tools used to carry out power analyses on full maps: fMRIpower and neuropower. These tools were then presented by Jeanette Mumford (creator of fMRIpower) and Joke Durnez (creator of neuropower) in the Open Science SIG room.



Data sharing

Another favorite topic of mine: data sharing. BIDS, or to give it its full name, the Brain Imaging Data Structure, is driven by Chris Gorgolewski and describes how to structure and store your data in an easily shareable way. It provides advice on how to name files and how to create simple metadata text files (tsv and json). Using BIDS doesn't require programming knowledge, and it substantially improves data sharing by allowing machines to read the data easily.
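To give an idea, a minimal BIDS-style layout might look like this (a toy example, not a complete dataset):

```
mydataset/
  dataset_description.json
  participants.tsv
  sub-01/
    anat/
      sub-01_T1w.nii.gz
    func/
      sub-01_task-rest_bold.nii.gz
      sub-01_task-rest_bold.json
```

The naming convention (subject, then modality or task, then suffix) is what lets software discover and read the data automatically.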

The controversies


A paper by Anders Eklund et al. about failure to control the family-wise error rate (FWER) using cluster size was recently published in PNAS and elicited many comments, not just online but also from the floor. The paper suggests that cluster-size correction significantly inflates false positives, thus addressing the extremely important issue of controlling the FWER, and is a must-read along with the comment from Glass Brain awardee Karl Friston and Guillaume Flandin.

From our community, gender imbalance and diversity were frequently discussed and added to the #GenderAvenger hashtag. It was often noted that committee members and awardees were predominantly white males from wealthy countries. The Organization is well aware of this and has actively sought to reflect the geographic diversity of our membership as well as to balance the number of male and female session and keynote speakers. Council takes this feedback seriously and is actively working to further address these issues, pushing for all aspects of OHBM to become as diverse as the members it represents.

Top 5 Twitter users

During the conference, the OHBM and OHBM_SciNews accounts retweeted posts from or mentioning the most active Twitter users. Thanks to the top 5: @kirstie_j, @NKriegeskorte, @pierre_vanmedge, @ChrisFiloG, @ten_photos.

Monday, 11 January 2016

On confidence vs capture percentage intervals

How often does the confidence interval of the mean capture the next sample mean?

The Confidence Interval of the sample mean

A typical interpretation error for confidence intervals (CI) is to think that they provide coverage for the statistic of interest. The 95% CI of the sample mean is thus often interpreted as an interval that has a 95% chance of including the true mean (Morey, Hoekstra, Rouder, Lee, & Wagenmakers, 2015). The correct interpretation is that, over repeated experiments, 95% of such intervals will include the true mean. This is an important distinction, as the correct interpretation does not say anything about the probability that this particular interval contains the true mean. Remember that the CI is computed using a cumulative (normal, for the mean) distribution and thus refers to a long-run probability. This means that, by repeating the same experiment on subjects from the same population with the same sample size, the 95% CI will include the population mean 95% of the time on average. This also implies that the CI does not say anything about the actual sample mean. This is easy to demonstrate with a simulation: first generate a reference population, then take a sample, compute the 95% CI of the sample mean, and check whether the population mean is in this interval. [Matlab code]

Define a population of 1 million individuals with mean 5

True_mean = 5;
Pop = randn(1,1000000) + True_mean;

Sample sizes of interest

SS = [10:10:100 200 300 400 500 600 700 800 900 1000];
N  = length(SS);

Simulate 10000 repeated experiments

alphav = 5/100; % a priori level of acceptance (Neyman-Pearson)
NbMC   = 10000; % number of Monte Carlo simulations
Errors = NaN(N,NbMC); % Errors flags the CIs that do not include the true mean

f = waitbar(0,'Number of simulations','name','MC');
parfor MC = 1:NbMC % for each simulated experiment
    X = randi(1000000,1,SS(end)); % draw 1000 subjects from Pop
    CI_ttest = NaN(2,N);
    for s = 1:N % compute the 95% CI of the mean for each sample size
        tmp = Pop(X(1:SS(s)));
        Deviation = icdf('T',1-alphav/2,SS(s)-1)*(std(tmp)./sqrt(SS(s)));
        CI_ttest(1,s) = mean(tmp) - Deviation;
        CI_ttest(2,s) = mean(tmp) + Deviation;
    end
    Errors(:,MC) = ((True_mean <= CI_ttest(2,:)) + (True_mean >= CI_ttest(1,:))) ~= 2;
end
close(f)

subplot(3,3,[1 2 4 5]); imagesc(1:NbMC,SS,Errors); colormap(gray);
xlabel('experiment #','FontSize',14); ylabel('Sample Size','FontSize',14);
title(['Errors: mean=' num2str(mean(mean(Errors,2)))],'FontSize',16);
fprintf('The average error across all sample sizes is %g\n', mean(mean(Errors,2)))
subplot(3,3,[7 8]); plot(sum(Errors,1)); ylabel('# of errors','FontSize',14); grid on; box on;
subplot(3,3,[3 6]); plot(SS,mean(Errors,2),'ro'); hold on
plot(SS,mean(Errors,2),'LineWidth',3); title('type 1 error rate','FontSize',14);
axis([0 1000 .04 .06]); set(gca,'xdir','reverse'); camroll(90); grid on; box on;

The simulation results are spot on: 5% of the time the CI did not include the true population mean.

The capture percentage of the sample mean

What about replications? If the 95% CI of the sample mean includes the true population mean 95% of the time, surely it will also include the mean of a replication study. From a theoretical analysis, Cumming & Maillardet (2006) showed that, on average, a new study with the same sample size has an 83% chance of having its sample mean inside the 95% CI of the first study. The capture percentage can easily be obtained by finding the alpha level which gives a CI with the expected coverage; in Matlab:

CP = 95; % we want 95% coverage of the mean
value = icdf('T',1-CP/100,SS-1)*sqrt(2); % get the critical T values for sample size SS
alphav = tcdf(value,SS-1); % get adjusted alpha values
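The ~83% figure is also easy to verify numerically. Here is a quick Python sketch for two studies with equal sample sizes (the t critical value is hard-coded for n = 30; names are illustrative):

```python
import numpy as np

def capture_rate(reps=5000, seed=0):
    """Fraction of replication means falling inside the original 95% CI,
    for two studies of n = 30 drawn from the same normal population."""
    n = 30
    tcrit = 2.045  # two-tailed 5% Student t critical value for df = 29
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        a = rng.normal(0.0, 1.0, n)   # original study
        b = rng.normal(0.0, 1.0, n)   # replication, same population and n
        half = tcrit * a.std(ddof=1) / np.sqrt(n)
        hits += abs(b.mean() - a.mean()) < half
    return hits / reps
```

Running this gives a capture rate close to 0.83, not 0.95: the replication mean varies on top of the original mean, so the original CI is too short to capture it 95% of the time.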

The problem is that most of the time the sample sizes will differ: a power analysis, for instance, may indicate that you should recruit more or fewer subjects. What are the chances of being in the 95% CI then? Again, we can run a little simulation: take a sample and compute the 95% CI of the sample mean, then take a new sample and check whether the new sample mean is within that 95% CI. Do that for multiple combinations of sample sizes.

replicate = NaN(N,N,NbMC); % binary: is the study 2 sample mean within the study 1 CI

f = waitbar(0,'Number of simulations','name','MC');
parfor MC = 1:NbMC
    % 1st study
    X = randi(1000000,1,SS(end)); % draw subjects from Pop
    Data = NaN(SS(end),N); index = 1;
    for s = 1:N % the data are from 10, 20, 30 ... to 1000 subjects
        Data(1:SS(s),index) = Pop(X(1:SS(s)))';
        index = index + 1;
    end
    [~,~,CI1] = ttest(Data); % compute the 95% CI

    % 2nd study
    X = randi(1000000,1,SS(end)); % draw subjects from Pop
    Data = NaN(SS(end),N); index = 1;
    for s = 1:N
        Data(1:SS(s),index) = Pop(X(1:SS(s)))';
        index = index + 1;
    end
    mean_values = repmat(nanmean(Data,1)',1,N);

    % test if the mean values are within the 95% CI of the 1st study
    replicate(:,:,MC) = ((mean_values >= repmat(CI1(1,:),N,1)) + (mean_values <= repmat(CI1(2,:),N,1))) == 2;
end
close(f)

% replicate dimensions:
% dim 1 (rows): the mean values of study 2 for each sample size
% dim 2 (columns): the CIs of study 1 for each sample size
% dim 3: the 10000 random draws

percentage_correct = mean(replicate,3).*100;
% for equal sample sizes we get the expected 83%
EqualSS = diag(percentage_correct);

figure; colors = jet; colors = colors(1:floor(64/N):64,:);
subplot(1,2,1); imagesc(SS,SS,percentage_correct);
title(['Percentage correct - mean(diag)=' num2str(mean(EqualSS))],'Fontsize',16);
xlabel('study 1 sample size','Fontsize',12);
ylabel('study 2 sample size','Fontsize',12);
subplot(1,2,2);
for s=1:N
    plot(SS,percentage_correct(s,:),'LineWidth',3,'Color',colors(s,:)); hold on;
end
axis([0 SS(end) 1 100]); plot(SS,repmat(83,1,N),'k--','LineWidth',2);
grid on; box on; title('Percentage correct','Fontsize',16);
xlabel('study 1 sample size','Fontsize',12);
ylabel('percentage correct','Fontsize',12);
%fig2plotly(gcf,'strip',false,'offline', true);

The results show that, depending on the sample sizes, as few as ~15% of replications will provide a sample mean that falls within the 95% CI of the original sample, while if many more subjects are collected it can reach 100%. In short, we cannot predict from a CI what the next sample mean is going to be. What I take from this is that CIs are useful to show multiple conditions or differences, because this reflects a statistical test, but that's about it.


Morey, Hoekstra, Rouder, Lee, & Wagenmakers (2015). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin & Review, 1-21.

Cumming & Maillardet (2006). Confidence intervals and replication: Where will the next mean fall? Psychological Methods, 11, 217-227.
