Tuesday, April 30, 2013

Type safety is not enough for safe Web development

 
A response in Stack Overflow about Looking for a type-safe web development platform
 
Type safeness does not solve all the problems and errors that an application may have. The main problems of web development is the stateless nature of HTTP, that forces an event handling programing model, full of unsafe identifiers of variables and event handlers refered here and there. The state most of the time is in the form of hash-tables of dynamically typed data, since the event handlers do not share variable scopes. Therefore state management is a nightmare, and code readability is not good since the event handling model is a design level goto.
 
Here, strong types do not help, except in the case of continuation based frameworks, like ocsigen (ocaml) and seaside (smalltalk). They handle nicely the back button, they keep state in normal variables and the navigation can be understood by reading the code. And they are mostly RESTFul to a certain level. But these frameworks are not scalable and have persistence problems by the inherent problems of continuations.
 
The other problem of web applications is the typeless nature of HTML, which can produce mismatches and runtime errors. But in this case all the Haskell web frameworks and many others do a fair job.
 
In MFlow not only each page, but the entire navigation is safe at compile time and it does not share the above problems. It has the nice properties of continuation based frameworks, but it is scalable, since it uses logging and backtracking instead of continuations. It uses standard Haskell web libraries: WAI, formlets, stm, blaze-html. It has a system of pluggable self-contained components.
 
This is a complete application with three pages. In a loop, it ask for two numbers and show the sum. you can press the back button as you please. There is no magic identifiers that you have to put here an there, in configuration files, pages and source code:
 
module Main where
import MFlow.Wai.Blaze.Html.All

main= do
  addMessageFlows  [("sum", transient . runFlow $ sumIt )]
  wait $ run 8081 waiMessageFlow

sumIt= do
  setHeader $ html . body
  n1 <- ask $  p << "give me the first number" 
               ++> getInt Nothing 
               <** submitButton "send"

  n2 <- ask $  p << "give me the second number" 
               ++> getInt Nothing 
               <** submitButton "send"

  ask $ p << ("the result is " ++ show (n1 + n2)) ++> wlink () << p << "click here"
  
 
The state can be made persistent with little modifications (by adding "step . ask", instead of ask,  and deleting "transient ." )

The event handling model is evil for web applications

This is a response in stack Overflow about What is the Haskell response to Node.js?

IMHO events are good, but programming by means of callbacks is not.

Most of the problems that makes special the coding and debugging of web applications comes from what makes them scalable and flexible. The most important, the stateless nature of HTTP. This enhances navigability, but this imposes an inversion of control where the IO element (the web server in this case) call different handlers in the application code. This event model -or callback model, more accurately said- is a nightmare, since callbacks do not share variable scopes, and an intuitive view of the navigation is lost. It is very difficult to prevent all the possible state changes when the user navigate back and forth, among other problems.

It may be said that the problems are similar to GUI programming where the event model works fine, but GUIs have no navigation and no back button. That multiplies the state transitions possible in web applications. The result of the attempt to solve these problem are heavy frameworks with complicated configurations plenty of pervasive magic identifiers without questioning the root of the problem: the callback model and its inherent lack of sharing of variable scopes, and no sequencing, so the sequence has to be constructed by linking identifiers.

There are sequential based frameworks like ocsigen (ocaml) seaside (smalltalk) WASH (discontinued, Haskell) and mflow (Haskell) that solve the problem of state management while maintaining navigability and REST-fulness. within these frameworks, the programmer can express the navigation as a imperative sequence where the program send pages and wait for responses in a single thread, variables are in scope and the back button works automatically. This inherently produces shorter, more safe, more readable code where the navigation is clearly visible to the programmer. (fair warning: I´m the developer of mflow)

The callback model is a design-level goto, and we will look back with relief when we finally get rid of it

Wednesday, April 24, 2013

MFlow: What about the data tier? Adding it to the shopping example

In the previous post I created an skeleton of a shopping application where the back button in the web browser is used for navigation, instead of a way to undo a transaction.  The purpose here is to add a small database of products, to search the database by different ways. This is the original  program, where three products are hardcoded:

module Main where
import MFlow.Wai.Blaze.Html.All
import Data.Typeable
import Data.String(fromString)

import qualified Data.Vector as V
main= do
   addMessageFlows  [("", runFlow $ shop ["iphone","ipad","ipod"])]
   wait $ run 8081 waiMessageFlow

shop products= do
   setHeader $ html . body
   setTimeouts 120 (30*24*60*60)
   catalog
   where

   catalog = do
           bought <-  step . ask $ showProducts products
           cart <-  getSessionData `onNothing` return emptyCart
           let n = cart V.! bought
           setSessionData $ cart V.// [(bought,n+1)]
           step $ do
             r <- ask $  do
                   cart <- getSessionData `onNothing` return emptyCart
                   p << showCart cart ++> wlink True << b << "continue shopping"
                                      <|> wlink False << p << "proceed to buy"


             if( r== False) then ask $ wlink () << "not implemented, click here"
                          else return ()

           catalog
   emptyCart= V.fromList $ take (length products) (repeat  (0::Int))
   showProducts xs= firstOf $ map (\(i,x) -> wlink i (p <<  x)) $ zip  [0..] xs

 


First, some headers:

{-# LANGUAGE DeriveDataTypeable #-}
module Main where
import MFlow.Wai.Blaze.Html.All as MF
import Data.Typeable
import Data.String(fromString)
import Data.TCache.DefaultPersistence
import Data.TCache.IndexQuery as Q
import Data.TCache.IndexText
import Data.Maybe
import qualified Data.Map as M
import Control.Workflow.Configuration
import Data.Text.Lazy as T


Lets define a product, and we will create some of them:

type ProductName= String
type Quantity= Int
type Price= Float


data Product= Product{ namep :: ProductName
                     , typep :: [String]
                     , descriptionp :: String
                     , pricep :: Price
                     , stock :: Int}
              deriving (Read,Show,Typeable)



createProducts= atomically $ mapM newDBRef
    [ Product "ipad 3G"   ["gadget","pad"]   "ipad 8GB RAM, 3G"       400 200
    , Product "ipad"      ["gadget","pad"]   "ipad 8 GB RAM"           300 300
    , Product "iphone 3"  ["gadget","phone"] "iphone 3 nice and beatiful"  200 100
    ]



We will use TCache for persistence.  It adds STM transactions, and user defined persistence. It also bring default persistence in files, that is what I will use now.

newDBRef (above) creates a database reference and the record pointed too. To do so  we need only to define the indexable instance for the record, that defines a unique key:

instance Indexable Product where
   key prod= "Prod "++ namep prod


We need to search by the product name namep, to search product by types typep and to perform text search in the description descriptionp.

main= do
   Q.index namep                          -- for field indexation
   indexList typep (Prelude.map T.pack)   -- for list indexation
   indexText descriptionp T.pack          -- for text indexation


These indexation statements create triggers that inspect creations and modifications of these fields in the registers.  (Use http://holumbus.fh-wedel.de/hayoo/hayoo.html for fast documentation about these and other primitives used in the application)

Then, we create the products one and one single time.

   runConfiguration "createprods" $ once createProducts

runConfiguration is a Workflow functionality for configuration. The purpose is to execute (once)  things one and one single time.

And the start of the shop, that will be defined later:

   addMessageFlows  [("", runFlow shop )]
   wait $ run 8081 waiMessageFlow



Lets redefine the shopping cart. Instead of a Vector, we will have the product name, the quantity and the unit price in a map:

type Cart= M.Map ProductName (Quantity, Price)
showCart :: Cart -> String
showCart = show


 shop = do
   setHeader $ html . body
   setTimeouts 120 (30*24*60*60)
   catalog
  


The catalog loop will do the same than in the previous static application: The user choose a product from the catalog, the shopping cart will be shown with the new product and so on:

   catalog = do       
       bought <- buyProduct
       shoppingCart bought
       catalog


Now the reservation (buyProduct) and the shoppingCart processing is more complex and with more steps:

 Let's tell something about Workflows and step. This statement writes the result of a computation in a log. When the program is restarted, each step call will read an entry in the log and return it, instead of executing the computation.  So the program will execute the steps already logged, so at the end of the log, the process instruction pointer is located after the last step saved. the shop computation will need to remember the state of the shopping cart after restar from a timeout set with setTimeouts (see previous posts about this), so it need to use step.

buyProduct return a product name. Now there are two options: Either a search for products or a navigation trough product types. To obtain the types, it is necessary to get all of them from all the products. This is what allElemsOf does.

   atomic= liftIO . atomically
   showList []= wlink Nothing << p << "no results"
   showList xs= Just <$> firstOf [wlink  x << p <<  x | x <- xs]

   buyProduct = step $ do
        ttypes   <atomic $ allElemsOf typep
        let types= Prelude.map T.unpack ttypes
        r  <- ask $   h1 << "Product catalog"
                  ++> p << "search" ++> (Left <$> getString Nothing)
                  <|> p << "or choose product types" ++>  (Right <$> showList types)


So r will have either the search string (by getString) or  what showList produces. And what it produces is Nothing if the list of types is empty or one of the links pressed -wlink- corresponding to a product name. firstOf  applied to a list of widgets, return the one activated by the user.

And now given either the search string or the type of product, it is necessary to read the product that meet the condition in the database:

   prods <case r of
      Left str           -> atomic $ Q.select namep $ descriptionp `contains` str
      Right (Just type1) -> atomic $ Q.select namep $ typep `containsElem` type1
      Right Nothing      -> return []


Here there are two kind of queries: the first is the names of products which contains the search terms in the description field. The second query is all the products that include the type in the typep field

The result is the list of product names. If no  search result of no types of products,  we return to the page again:


      if Prelude.null prods then buyProduct else do

                let search= case r of
                    Left str ->    "for search of the term " ++ str
                    Right (Just type1) -> "of the type "++ type1

        r <- ask $ h1 << ("Products " ++ search) ++> showList prods
        case r of
              Nothing   -> buyProduct
              Just prod -> breturn prod


Here showList present the products as links. so r will return the chosen product or Nothing if there was no result for the previous query for products (It is not the case, since the that has been ruled out in the first line).

Now, the shoppingCart. First the shopping cart is retrieved form the session context with getSessionData (see the previous post). then, to know the price, the register with this name is retrieved. Then the cart is updated.

shoppingCart bought= do
       cart <- getSessionData `onNothing` return (M.empty ::Cart)
       let (n,price) = fromMaybe (0,undefined) $ M.lookup  bought cart
       (n,price) <- step $ do

                  if n /= 0 then return (n,price) else do
                    [price] <- atomic $ Q.select pricep $ namep .==. bought
                    return (n, price)
       setSessionData $ M.insert  bought (n+1,price) cart


Since namep determine the main key (see the Indexable instance), we would substitute the query

[price] <- atomic $ Q.select pricep $ namep .==. bought

By

price <- readResource Product{ namep= bought} >>= return . pricep . fromJust

Which is quite faster. So, really,  there is no need to index the main key. But it has been done in this  case for a matter of example. readResource gets an incomplete object with defined key and returns Maybe the complete register.

Finally, the shopping cart is visualized:

       step $ do
         r <- ask $ do

              cart <- getSessionData `onNothing` return (M.empty :: Cart)
              h1 << "Shopping cart:"
                ++> p << showCart cart
                ++> wlink True  << b << "continue shopping"
                <|> wlink False << p << "proceed to buy"


         if not r then ask $ wlink () << "not implemented, click here"             else breturn ()

 breturn  means that the procedure return, but may be called back when backtracking as a result of the button back pressed in the web browser.

Here there is no real "buy" operation implemented. The naming as "buy" operations what are really reservations of products in a shopping cart are just in order to maintain the names of the previous version. Actually, they are shopping cart reservations. Buy It means to modify the stock of the product in the database and avoid to roll-back when backtracking. I will do it in the next post.

 The code above is complete. This example is embedded in this source file:

https://github.com/agocorona/MFlow/blob/master/Demos/loginUserPage.hs

 The MFlow package:
https://github.com/agocorona/MFlow



Sunday, April 21, 2013

More on Session management in MFlow: getSessionData

In the previous post "controlling backtracking in MFLow" I told about how the backtracking mechanism match the request coming from a page in the navigation history with the piece of code that generated it, so that the flow could proceed from this point on. ask will backtrack to the previous ask until the parameters in the request match. This ask statement contains a computation in the View monad with a closure that contains the variable values that where used when the page was created. This is tbe essence of the backtracking mechanism. In a continuation-based framework, such is osigen from Ocaml or seaside from Smalltalk, all the navigation is exposed to the web server, so the request goes directly to the intended closure. In MFlow there is a single entry point which is a running process that will backtrack to the ask statement that match with the page if necessary. If the process is not running, it will be restarted.

However sometimes, the back button is used for navigation and we want to keep the state, not to roll it back. In the previous example of the shop, the cart is passed as a parameter. That may be tedious if the flow is complicated. To solve these two problems, I created getSessionData and setSessionData. which stores and retrieves user-defined data in the session, by means of a map indexed by data type. In this way the user don´t need to pass its application data by parameters, neither it need to embed its own state monad. But also, the programmer when pressing back, can choose to backtrack to previous state values  or not, depending on the nature of the flow.

This session data works in the Flow monad as well as the View monad. That means that it can be used in both sides of ask. Since when backtracking the only code executed is the View monadic code behind ask, getSessionData,  when it is in View monad, will retrieve the last state no matter if it is on backtracking. This is the way to keep the user-defined state actualized to the last value when backtracking.

In the example below, the shopping cart is stored as session data. In the ask statement that show the shopping cart, getSessionData is used in the View monad (in bold), so when backtracking  the code will get the last shopping cart, so no roll-back effect will appear and the back button can be used for navigation purposes.  Here is a  loop with an alternance between buying from the catalog and showing the shopping cart. The loop, and the navigation can be executed forward, by pressing the links, and backward, by means of the back button. The result is the same.

If getSessionData is not in the View monad, and the cart variable in scope is used, then this variable will keep its value that it had at each moment of the loop, so when going back we will see fewer and fewer items in the shopping cart.

Here step is used, so the state is persistent.  setTimeouts will kill the process after two minutes. After one month, if the user has not returned, the state will be erased and the shopping  cart will be empty. This is the complete program:

 
module Main where
import MFlow.Wai.Blaze.Html.All 
import Data.Typeable
import Data.String(fromString)

import qualified Data.Vector as V

main= do
   addMessageFlows  [("", runFlow $ shop ["iphone","ipad","ipod"])]
   wait $ run 8081 waiMessageFlow

shop products= do
   setHeader $ html . body
   setTimeouts 120 (30*24*60*60)
   catalog
   where

   catalog = do
           bought <- step . ask $ showProducts products
           cart <- getSessionData `onNothing` return emptyCart
           let n = cart V.! bought
           setSessionData $ cart V.// [(bought,n+1)]
           step $ do
             r <- ask $  do
                   cart <- span=""> getSessionData `onNothing` return emptyCart
                   p << showCart cart ++> wlink True << b << "continue shopping"
                                      <|> wlink False << p << "proceed to buy"


             if( r== False) then ask $ wlink () << "not implemented, click here"
                          else return ()

           catalog

   emptyCart= V.fromList $ take (length products) (repeat  (0::Int))

   showProducts xs= firstOf $ map (\(i,x) -> wlink i (p <<  x)) $ zip  [0..] xs 
 
 
MFlow : http://github.com/agocorona/MFlow
 
 

Saturday, April 20, 2013

Friction-Free democracy has born

The project for which I created my Haskell libraries has officially started:

http://frictionfreedemocracy.org

We are fundraising at rockethub.com



Imagine that you have to decide, with other people, about many small or big issues in your company, your town or your country; and you can, through a web application, choose whether to vote directly or to delegate your vote on a friend of yours who shares your point of view about this particular problem, or who is more knowledgeable about the issue at hand. Imagine that you can delegate on different persons depending on the topic, the group or the particular decision to be made. This persons can delegate their votes as well. Imagine that even yourself can receive delegated votes. But, in any case, anyone can later vote for themselves and override the vote of their delegates at any time during the voting process. This is called cascade delegation and revocable delegation.
Imagine that you can see the provisional results in real time, and you can change your vote during the voting period depending on these. This is called continuous-round voting.

Imagine that you can propose, or amend what others have proposed, and the people can vote the original proposal and the one you amended. This is called an open system of proposals and amends.
Imagine that you can create your own decision flow. Imagine that, for any collective decision to be made, you can configure, using menus, the form in which the decision should be written, which group of people should vote it, which percent of approval is required, which group of people will receive the proposal if it is approved or rejected and what they will do with it, and so on. This is an easy creation of decision flows.

Imagine that these decision flows can also be voted and amended, as a normal proposal would. Imagine that you have an application where you can define, again using simple menus, from the Constitution of a Country to the decision process of a particular working group in a company, or a small process for your friends to decide where to travel this summer. This is an open system for decision workflows.

Imagine that this process can run not on a central server, but on the computers of each and every person who wishes to connect to the cloud. For example, your own. Imagine that even after the voting process is finished, you can count again all the votes with your own software, so that there is no possible deceit. This is called direct verifiability.

Imagine that you can know how much each proposal costs you, and you can know in real time how the budget is spent. This is called direct budget and execution control.
Imagine that you can choose either to make your vote public to everyone, to a particular person or group of people (the people who delegated his vote to you, for example) or to make it anonymous. This is called customized confidentiality.

Imagine that you have access to a tool that, besides all that, can also automate any kind of decision process that is established now (He who can do more, can do less). You can limit the delegates, you can establish who will be able to propose, amend or vote. You can define weighted votes, quality votes, etc.

That would be the full implementation of the promises of the Internet for politics and decision making. That would mean to give back the power and the responsibility to the people. That would be to harness collective intelligence.

The Haskell programming language was chosen for the project from the very beginning, because it is high level, pure and functional. . . and fast. And, most of all, it is beautiful. That means that it is highly productive and one of the most verifiable languages. The latter is necessary when you have to trust in what the program does in a critical application such as electronic democracy.

The requirements of FFD are very demanding. If these requirements were to be met with current developments, a lot of coding, sometimes at a low level, would be needed; so we decided to start from the beginning.

From as early as 2005 we have been making mock-ups to test many concepts of FFD, like the system of proposals/amends, cascaded delegation, workflow traceability, versioning, revocable delegation and server synchronization.

In order to create such a demanding application, we needed a stack of libraries to carry out functionalities at a deeper level, so that various higher-level functionalities can be covered by a single deep-level functionality. For this purpose we have been developing several general purpose libraries, which are the base infrastructure of Friction-Free Democracy, and released them as open source.
For example, the Workflow library is in charge of definition, execution and workflow configuration. RefSerialize takes care of data compression and traceability of modifications. TCache carries out transactions and data persistence. MFlow executes web applications which use all these functionalities.

Thursday, April 18, 2013

A comparison of Web Frameworks: How they handle application state


This table summarizes my research about how different web frameworks handle state in web applications. Sorry for the bad formatting. As usual, it is the result of my ethernal, hopeless fight with blogger.com.  Additionally  this table is generated from a latex script passed trough an online converter. ( I´m too lazy to redo it again in HTML)

The event model, or state transition model is imposed by the inversion of control of the web architecture, where the  IO handler, the web server,  call different pieces of programmer code, instead of the other way around. This is imposed by the stateless nature of HTTP and the hyperlinked nature of HTML. But this inversion of control can be reverted back again, so that an intuitive, sequential programming style can be achieved for web applications without losing navigability. This happens in the continuation based frameworks - at the cost of scalability. 

WASH the old but fine web framework from  Peter Thiemann ran  state as a sort of small log that is replayed after each request so the program find its right location of code to respond to each request, so a WASH program would run even as CGI extension. MFlow uses such a small log, but the process stay running between request (unless timeout, in which case replay the log) and uses backtracking to match the request when back button is pressed.  Because a log is made of events and the events can be easily synchronized among machines, MFlow is scalable.
 

Web frame-
How it works

How
state/navigation
Problems


work





works









Event model
- State transition machine model



MVC:

MVC,

mostly
Add
session
state as
Event-based:
spaguetti
ASPX,

stateless.

Page
hash
tables.
flash
ob-
code. back button usually
JSP,
Rails,
oriented


jects(Rails)


does not work (roll-back
node.js,
Yesod,
HappStack,
Snap









state hardly possible)





Struts, JSF
MVC page and ses-
Add
basic configurable
Event based, complex, no


sion oriented

navigation (XML)

checks
at
compile time.










No back button support











Seam,

Flow

oriented
add
conversations,
sub-
Event
based,
complex-
Spring
Web
via XML config
flows, back button sup-
ity,No checks at compile
Flow

and

language
port



time





extensions



















Apache
MVC
with
inter-
add
serializable
page
Event
based.
No
wicket

face in java ob-
state. support back button
back
button
detec-


jects+HTML

(with appropriate browser
tion,serialized state may






configuration)


be big



















Procedural, sequential style




Coccon, sea-
Continuation-
serializable
execution
compile time checks vary.
side, ocsigen
based,
flow
in a
state, support back button
serialized
state
may be


single

procedude




big.
no
bookmarkable


(javascript,smalltalk,




URLs,
scalability prob-


Ocaml







lems





respectively)

















WASH

Monad


for
Log recreate the execu-
Each request implies a re-


creation-replay
tion state


play of the log. log in the


of the log, flow in




browser can be hackedv


a single
procedure










(haskell)



















MFlow
Monad


trans-
Log may recreate the ex-
Under test




formers
for
log-
ecution state.
Execu-






ging/recovery

tion state stay in memory.






and

navigation
Back button supported by






backtracking,
flow
backtracking. bookmark-






in a single Pro-
able URLs








cedure(Haskell),










checked at runtime.






















Flow state management on different Web Framewors

Wednesday, April 17, 2013

Controlling backtracking in MFlow

MFlow Web applications are defined like console applications, as sequences of interactions with the user. When the user press the back button and send a request from a previous page, the application backtrack to the statement that match the request, so that the application resume from this point on. That way, the state is synchronized with the user navigation. This may be the skeleton of a flow with a shopping page that manage a shopping cart


shop products= do 
   setHeader $ body . html
   setTimeouts 120 (30*24*60*60) 
   loop emptyCart 
   where 
   loop cart= do 
      r <- step . ask $ products <** showCart cart 
                      <+> wlink () << p << exit shop case r of 
      case r of 
       (Just bought,_) -> loop cart 
       _               -> breturn cart


This flow has a single ask statement, but it is within a loop and step make this result persistent. Let's see what this means. setTimeouts establishes how long the process is running in memory and how long the serialized session data is recorded, respectively. Each product selected on each ask request is stored in the log. If, after two minutes, the user select another product, the process will be restarted and will recover the shopping cart state by re-executing the loop, taking as input the log content until the log is finished. Unless the user does not enter for a month, in which case, the log will be deleted and the shopping cart will appear empty. When the user press the "exit shop" link, the flow will return the shop cart to the calling flow. In the previous example, it is noteworthy that, if the user is adding products to the shopping cart, when he press the back button, the previous page will appear, with one product less in the cart. In this page, when he select other product and send the request, the application will backtrack one step in the loop, so the shopping cart will roll back from the last transaction and the shopping cart will reflect the page that the user is seeing. This synchronization of state and pages occurs on every case thanks to the backtracking mechanism. So that if the user is in a shopping cart and go back, he see pages with less and less items. When the user decides to proceed from this point the state of the shopping cart -because of the backtracking- will match "magically" with what is displayed in the user page.

All this is fine, but sometimes it is dangerous to go back. For example when a transaction that can not be undone has been done. For example a payment. For this purpose, it is necessary a cut in backtracking.

 I added preventGoingBack, and pushed it to the Git repository. It perform this cut in backtracking when going back in the flow. When this condition is detected, ir executes a subflow that is the parameter. Usually this subflow consist of a single ask statement that present a single page with an error. When going forward, the statement is transparent:

    ask $ wlink () << b << "press here to pay 100000 $ "
    payIt
    preventGoingBack . ask $ b << "You payed 10000 once" ++> wlink () << b << " Please press here"
    ask $ wlink () << b << "Press the back button to verify that you can not pay again" 
    where
    payIt= liftIO $ print "paying"

Here preventGoingBack does not permit to navigate back to code before himself. The user can go back in the browser, but what the user will see when he press a button or link in the flow, is the message of preventGoingBack and their available options, after which the flow will proceed normally.

This example uses the last version of MFlow at https://github.com/agocorona/MFlow.

It uses the last version of Workflow https://github.com/agocorona/Workflow

TCache : https://github.com/agocorona/TCache

RefSerialize: https://github.com/agocorona/RefSerialize