Ahead of his talk at Generate Sydney, the man behind ‘A Dao of Web Design’ offers a glimpse into the internet’s future.
Don’t miss John Allsopp’s opening session – Predict the future with this one weird old trick – at Generate Sydney on 5 September. Can’t get there? There are Generate conferences coming up in San Francisco and London too!
If anybody can predict the future of the web, John Allsopp is better placed than most. He is, after all, the man who just might have coined the inglorious term 'Web 2.0' (he can document his usage of it before Tim O'Reilly popularised the phrase).
Allsopp is also feted as one of responsive design's founding fathers. His essay 'A Dao of Web Design', published in 2000, encourages designers to let go of print's rigidity and embrace the web's fluidity. Ethan Marcotte cites it as one of his key inspirations.
So what does Allsopp – who’s speaking at our Generate Sydney event in September – predict for the next century? You may be somewhat surprised by what he has to say…
Read the article here: Creative Bloq
The European Medicines Agency has published its first issue of the QPPV Update.
The European Medicines Agency (EMA) is a decentralised agency of the European Union (EU), located in London. It began operating in 1995. The Agency is responsible for the scientific evaluation, supervision and safety monitoring of medicines developed by pharmaceutical companies for use in the EU.
The issue provides Qualified Persons responsible for Pharmacovigilance (QPPVs) and all other people working in pharmacovigilance with an update on EU Pharmacovigilance.
Its main content covers news on the following topics and what they mean for you as someone working in pharmacovigilance:
- “Pharmacovigilance in the product lifecycle” including: Risk Management Planning critical for patient safety and product innovation, Companies encouraged to seek scientific advice for PASS, Initiative on Patient Registries.
- “Pharmacovigilance Processes” including topics such as: Quicker product information updates, Medical literature monitoring, Reliance on Article 57 data, Joint PRAC/CHMP assessment reports, Measuring the impact of pharmacovigilance activities.
- Pharmacovigilance guidance: a table listing the latest adopted, planned, and in-development guidance documents.
- “Pharmacovigilance IT Systems” covering: Article 57 database, EudraVigilance Auditable Requirements, PSUR Repository, Pharmacovigilance Fees.
- Pharmacovigilance dialogue about upcoming EMA events.
Download the issue here: EMA Site
The well-known quote from Andrew Lang reads as follows: “The statistician uses statistics as a drunken man uses lamp posts—for support rather than illumination.” It is easy for a mathematician or statistician to interpret the result of a statistical analysis with caution, but someone who is only interested in the result, and less familiar with the mathematical background of the methods used, can easily jump to a wrong conclusion. The simplest example of why prudence is needed when interpreting statistical results is Simpson's paradox (first described by Edward H. Simpson in 1951).
Consider the following study. A new drug is being tested on a group of 800 people (400 men and 400 women) with a particular disease. The aim is to establish whether there is a link between taking the drug and recovery from the disease. In a standard scenario half of the people (randomly selected) are given the drug and the other half are given placebo. The results in the following table show that, of the 400 given the drug, 200 (50 %) recover from the disease; this compares favourably with just 160 out of the 400 (40 %) given the placebo who recover.
| Recovered | Drug: No | Drug: Yes |
|---|---|---|
| No | 240 | 200 |
| Yes | 160 | 200 |
| Recovery rate | 40% | 50% |
So clearly we can conclude that the drug has a positive effect. Or can we? A more detailed look at the data leads to exactly the opposite conclusion. Specifically, the following table shows the results when broken down into male and female subjects.
| Recovered | Female, no drug | Female, drug | Male, no drug | Male, drug |
|---|---|---|---|---|
| No | 210 | 80 | 30 | 120 |
| Yes | 90 | 20 | 70 | 180 |
| Recovery rate | 30% | 20% | 70% | 60% |
Focusing first on the men, we find that 70% taking the placebo recover, but only 60% taking the drug recover. So, for men, the recovery rate is better without the drug. Similarly, with the women we find that 30% taking the placebo recover, but only 20% taking the drug recover. So, for women, the recovery rate is also better without the drug. We can therefore conclude that in every subgroup the drug performs worse than the placebo.
The process of drilling down into the data this way (in this case by looking at men and women separately) is called stratification. Simpson’s paradox is simply the observation that, on the same data, stratified versions of the data can produce the opposite result to non-stratified versions. Often, there is a causal explanation. In this case men are much more likely to recover naturally from this disease than women. Although an equal number of subjects overall were given the drug as were given the placebo, and although there were an equal number of men and women overall in the trial, the drug was not equally distributed between men and women. More men than women were given the drug. Because of the men’s higher natural recovery rate, overall more people in the trial recovered when given the drug than when given the placebo.
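The aggregate and stratified recovery rates above can be checked with a few lines of Python (a minimal sketch reproducing the same counts; all the variable names are ours, and the data is the illustrative example, not a real trial):

```python
# Counts from the example above: 400 men and 400 women, split by whether
# they took the drug and whether they recovered.
data = {
    ("female", "drug"):    {"recovered": 20,  "not_recovered": 80},
    ("female", "placebo"): {"recovered": 90,  "not_recovered": 210},
    ("male",   "drug"):    {"recovered": 180, "not_recovered": 120},
    ("male",   "placebo"): {"recovered": 70,  "not_recovered": 30},
}

def recovery_rate(groups):
    """Recovery rate pooled over one or more count groups."""
    recovered = sum(g["recovered"] for g in groups)
    total = sum(g["recovered"] + g["not_recovered"] for g in groups)
    return recovered / total

# Aggregated over both sexes the drug looks better: 50% vs 40%.
drug_rate = recovery_rate([data[("female", "drug")], data[("male", "drug")]])
placebo_rate = recovery_rate([data[("female", "placebo")], data[("male", "placebo")]])
print(drug_rate, placebo_rate)  # 0.5 0.4

# Stratified by sex, the placebo wins in *every* subgroup.
for sex in ("female", "male"):
    print(sex,
          recovery_rate([data[(sex, "drug")]]),
          recovery_rate([data[(sex, "placebo")]]))
```

Running this shows the paradox directly: the pooled comparison favours the drug, while both per-sex comparisons favour the placebo.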
One may ask: 'Does this difficulty arise in more general cases (e.g. if we stratify the data into more subgroups)?' or 'How can we avoid this kind of effect?'. For answers and more details, please refer to the following articles:
Definition of Customer Segments
Customer segmentation has undoubtedly been one of the most implemented applications in data analytics since the birth of customer intelligence and CRM data.
The concept is simple. Group your customers together based on some criteria, such as revenue creation, loyalty, demographics, buying behavior, or any combination of these criteria, and more.
The group (or segment) can be defined in many ways, depending on the data scientist’s degree of expertise and domain knowledge.
- Grouping by rules. Somebody in the company already knows how the system works and how the customers should be grouped together with respect to a given task, e.g. a campaign. A Rule Engine node would suffice to implement this set of experience-based rules. This approach is highly interpretable, but not very portable to new analyses. Given a new goal, new knowledge, or new data, the whole rule system needs to be redesigned.
- Grouping as binning. Sometimes the goal is clear and not negotiable. One of the many features describing our customers is selected as the representative one, be it revenues, loyalty, demographics, or anything else. In this case, the operation of segmenting the customers in groups is reduced to a pure binning operation. Here customer segments are built along one or more attributes by means of bins. This task can be implemented easily, using one of the many binner nodes available in KNIME Analytics Platform.
- Grouping with zero knowledge. We can assume that the data scientist frequently does not know enough about the business at hand to build their own customer segmentation rules. In this case, if no business analyst is around to help, they should resort to a plain, blind clustering procedure. The follow-up work of interpreting the clusters belongs to a business analyst, who is (or should be) the domain expert.
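As a concrete illustration of the second option, grouping as binning can be sketched in a few lines of Python (the revenue thresholds and customer data are invented for the example; in KNIME the same operation would be performed with one of the binner nodes):

```python
# Hypothetical revenue thresholds; in practice these would come from the
# business side or from quantiles of the actual revenue distribution.
def revenue_segment(annual_revenue):
    """Bin a customer into a segment by annual revenue."""
    if annual_revenue < 1_000:
        return "low"
    if annual_revenue < 10_000:
        return "medium"
    return "high"

# Toy customer data: name -> annual revenue.
customers = {"alice": 500, "bob": 4_200, "carol": 25_000}
segments = {name: revenue_segment(rev) for name, rev in customers.items()}
print(segments)  # {'alice': 'low', 'bob': 'medium', 'carol': 'high'}
```

The whole segmentation reduces to a lookup along a single attribute, which is exactly why this approach only works when the goal is clear and not negotiable.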
With the set goal of making this workflow suitable for a number of different use cases, we chose the third option.
There are many clustering procedures, and KNIME Analytics Platform makes them available in the Node Repository panel, in the category Analytics/Mining/Clustering: k-Means, nearest neighbors, DBSCAN, hierarchical clustering, SOTA, and more. We went for the most commonly used: the k-Means algorithm.
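The core of the k-Means procedure can be sketched in pure Python (an educational toy, not the KNIME node; a real project would use the KNIME k-Means node or a library implementation with smarter initialization such as k-means++):

```python
import math

def kmeans(points, k, iterations=20):
    """Toy k-means on 2-D points: returns (centroids, label per point).
    Initializes centroids with the first k points; real implementations
    use random restarts or k-means++ instead."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return centroids, labels

# Two well-separated toy "customer" clusters (e.g. scaled revenue vs. visits).
pts = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, labels = kmeans(pts, k=2)
print(labels)  # points 0-2 end up in one cluster, points 3-5 in the other
```

After a few iterations the two centroids settle on the means of the two groups; interpreting what each resulting cluster means for the business is the analyst's job, as noted above.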
Read more: KNIME.ORG
SatRdays are community-led, regional conferences to support collaboration, networking and innovation within the R community. The initiative of Steph Locke and Gergely Daroczi was accepted and funded by the R Consortium. The very first event of this series took place in Budapest, Hungary on September 3, 2016, with almost 200 attendees from 19 countries and 12 hours of pure R fun. The day began with various workshops, followed by two keynotes and several regular talks, and ended with a data visualization challenge. The complete schedule can be found on the conference website, http://budapest.satrdays.org . The talks were live-streamed and can be watched online: http://www.ustream.tv/channel/xFdxHeVnGKS . If you have only limited time, we recommend the following talks: the first keynote by Gábor Csárdi (R package history), Romain François‘ question section (including a marriage proposal), the second keynote by Jeroen Ooms (HTTP requests, ImageMagick) and data sonification by Thomas Levine. Overall, the first satRdays event received very positive feedback from the R community and started to establish the reputation of the series. Personal thoughts about the conference from the main organizer were published at https://www.r-consortium.org/news/blogs/2016/09/start-satrdays .
“With the advent of data science and the increased need to analyze and interpret vast amounts of data, the R language has become ever more popular. However, there’s increasingly a need for a smooth interaction between statistical computing platforms and the web, given both 1) the need for a more interactive user interface in analyzing data, and 2) the increased role of the cloud in running such applications.
Statisticians and web developers have thus seemed an unlikely mix till now, but make no mistake that the interactions between these two groups will continue to increase as the need for web-based platforms becomes ever more popular in the world of data science. In this regard, the interaction of the R and Shiny platforms is quickly becoming a cornerstone of interaction between the world of data and the web.
In this tutorial, we’ll look primarily at the commands used to build an application in Shiny — both on the UI (user interface) side and the server side. While familiarity with the R programming language is invariably helpful in creating a Shiny app, expert knowledge is not necessary, and this example will cover the building of a simple statistical graph in Shiny, along with some basic commands illustrating how to customize the web page through HTML.” – Michael Grogan
Continue to tutorial: SitePoint