Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start of Sprint 2 #49

Open
wants to merge 69 commits into
base: master
Choose a base branch
from
Open

Start of Sprint 2 #49

wants to merge 69 commits into from

Conversation

sras
Copy link
Contributor

@sras sras commented Nov 4, 2016

Things planned

1. Solve the createdAt/updatedAt problem -- 10 Hrs

Use either of these options

  1. Make the fields non-writable in TableW definitions and use default column values and triggers to update these fields directly in the DB.

  2. Keep these fields writeable and use a wrapper function over insertManyAndReturning and runUpdate which sets & updates these columns automatically (this reducing boilerplate)

2. Make the id field non-writable in the TableW mappings. However, it should be readable. Opalaye should never be writing out primary keys of tables. -- 1 Hr

3. Use lenses in all the domain API functions so that you don’t have to pattern-match the record constructor everywhere -- 2 Hrs

4. Implement audit logs -- 10 Hrs

Use ether of two options:

  1. Using Posgres triggers
  2. Using wrapper functions over insertManyAndReturning and runUpdate that somehow track the changes being made to the record and insert a row in the audit_logs table

@sras sras force-pushed the feature/spock-opaleye branch from 3f60c8b to e7fef1e Compare November 7, 2016 09:34
@sras sras force-pushed the feature/spock-opaleye branch from 94a40a7 to df887ed Compare November 9, 2016 05:33
Copy link
Contributor

@saurabhnanda saurabhnanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • What's the advantage of splitting the Opaleye boilerplate into OpaleyeTypes and OpaleyeDef and DataTypes?
  • Can we convert all identifier (functions, variables, etc) to camelCase instead of snake_case? What does the Haskel style guide say?
  • Use a helper function to remove the following boilerplate from all places:
  return $ case r of
    []     -> Nothing
    (x:xs) -> Just x

, HasId columnsR (Column PGInt4)
)
=> Connection -> Table columnsW columnsR -> item_id -> haskells -> IO haskells
update_row conn table it_id item = do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove it_id from the function signature. We're already assuming that we have a field called id in the item itself. What is the advantage of passing the id separately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing the id separately explicitly specify which record to update. The record contains the data that the row should be updated with. The id specify the row that should be updated. Since those are separate things, I think those must remain separate.

Imagine a case when someone have a record where they have made some updates to the fields, and want to save to a certain Id. But imagine that they missed to change the id field. When they make the call, the wrong record is going to get updated. There is nothing warning the user that they didn't specify the id of the row that should be updated, because that was implicitly assumed to be the id that came with the updated record.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine a case when someone have a record where they have made some updates to the fields, and want to save to a certain Id.

Whoa... stop right there. You don't get to "pick" which record should be saved to which ID! The ID is part of the record and self-identifies its row.

But imagine that they missed to change the id field.

Which is the reason why ID should be a readonly field in the first place. I have yet to come across a valid use-case where one is mutating the primary key (ID) of a record -- either in the DB itself, or a record/object which represents the row in the DB.

Do you have a valid use-case for doing this?

Copy link
Contributor Author

@sras sras Nov 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are using an ORM that provides an Identity map and in a language with mutable values, you can mutate the fields of the object and call the update method to sync itself with the database.

But in Haskell, owing to it's immutable nature, this pattern is not possible. So in haskell, there cannot be a unique instance that represent the data for an entity. For example, a User record cannot be made to uniquly represent a row in the user table. Because that would require that another user record that represent the same row does not exist at the same time.

In a ORM with an Identity map, if a = user.post and user2 = a.author, then user and user2 refer to the same object. But in haskell, user and user2 can be different records, even though they hold same values, including the ID.

This means, we cannot treat records as authoritative representations of the underlying domain objects, and hence cannot use patterns that depend on this assumptions IMHO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in Haskell, owing to it's immutable nature, this pattern is not possible. So in haskell, there cannot be a unique instance that represent the data for an entity. For example, a User record cannot be made to uniquly represent a row in the user table

What would you call the record-type that Opaleye returns, which has the ID inside of it? The ID is the database identity of the record, which is what we're bothered with.

In fact, if we want to be pedantic about it, I would NOT EXPORT the id setter lens at all. No one should be able to change the id of a record which represents a row in the DB. It has no practical purpose, and will result in bad things.

This means, we cannot treat records as authoritative representations of the underlying domain objects, and hence cannot use patterns that depend on this assumptions IMHO.

As soon as the DB moves out of the database and goes to any other "temporary holding area" (record in Haskell, data-structure in Redis, etc.) it is no longer an "authoritative representation" at all. So, I'm unsure of how this really elucidates the Haskell or non-Haskell way of doing things.

And what exactly are we trying to prevent here? Can you give a real-life example of the use-case you are envisioning?

Imagine a case when someone have a record where they have made some updates to the fields, and want to save to a certain Id.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ID is the database identity of the record, which is what we're bothered with.

A database row can only have one state at a given time. If you want a in-memory representation of a database row, you have to make sure that there are no more than one instance of it at the same time in the scope. If you cannot gaurantee that, then you will have inconstant state in your app, when there are conflicting representations of the same database row, and upon saving to the db, you will lose data.

In fact, if we want to be pedantic about it, I would NOT EXPORT the id setter lens at all. No one should be able to change the id of a record which represents a row in the DB. It has no practical purpose, and will result in bad things.

I am not sure preventing updates to id will make any difference. Because either way, you are not mutating in place. So there is really no difference between updating id, and updating all the fields while keeping the id same. Because they both are different instances.

There is no difference between the result of the following operations

Starting with two User records,

u1 = User {id = 10, name = "max", age= 20}
u2 = User {id = 15, name = "bob", age= 25}

Updating the id field of u1

u3 = u1 & id .~ 15

is equalent to updating the name and age of u2 to that of u1, while keeping the id same

u3 = u2 & name .~ "max" & age .~ 20

In both cases u3 has same values and is a different instance from either u3 or u2.

As soon as the DB moves out of the database and goes to any other "temporary holding area" (record in Haskell, data-structure in Redis, etc.) it is no longer an "authoritative representation" at all.

It is still authoritative representation even after it leaves the DB. In fact, once it leaves the DB, the row in the DB cease to be the authoritative representation, because the data might be getting mutated as it passes through the app. The row in the DB won't be valid until the entity gets saved back to the database.

And what exactly are we trying to prevent here? Can you give a real-life example of the use-case you are envisioning?

What we are trying to prevent is ending up with inconsistent data. I gave you an example earlier. You reply was

Whoa... stop right there. You don't get to "pick" which record should be saved to which ID! The ID is part of the record and self-identifies its row.

But that breaks down as soon as you have multiple instances of a record with the same ID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that breaks down as soon as you have multiple instances of a record with the same ID.

I'm sorry, but I'm not following your argument here. If you really want to get behind this argument, then the data-structures should be modelled on the lines of what Persistent has; wherein the identity (primary key) is not part of the value-record (other fields of the row). However, in Opaleye the ID is part of the record itself. So, there is no point mixing the two paradigms of thought over here.

Inconsistency is written all over this function signature:

updateTenant :: TenantId -> Tenant -> IO Tenant

There are two possible places where TenantId can be taken from. Which one is right? Why are there two places in the first place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we are trying to prevent is ending up with inconsistent data. I gave you an example earlier.

I meant an actual piece of code that you would genuinely write this way:

  • Fetch a record from the DB (and the record contains an ID field)
  • Change the values, but not the ID to obtain a new record (because of immutability and all that)
  • Write the record back to the DB, but to overwrite a different row

@@ -58,50 +40,25 @@ deactivate_tenant :: Connection -> Tenant -> IO Tenant
deactivate_tenant conn tenant = set_tenant_status conn tenant TenantStatusInActive

set_tenant_status :: Connection -> Tenant -> TenantStatus -> IO Tenant
set_tenant_status conn tenant status = update_tenant conn (tenant_id tenant)
tenant { tenant_status = status }
set_tenant_status conn tenant st = update_tenant conn (tenant ^. id) (tenant & status .~ st)

update_tenant :: Connection -> TenantId -> Tenant -> IO Tenant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see my earlier comment about passing id separately from the record (which already contains the ID).

Copy link
Contributor Author

@sras sras Nov 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Opaleye files were split to resolve a circular imports issue that I came across. But it does not seem to be an issue any more and I have merged them back.

Will convert code from snake_case to camelCase.

@saurabhnanda
Copy link
Contributor

Use a helper function to remove the following boilerplate from all places:
return $ case r of
[] -> Nothing
(x:xs) -> Just x

Just check if we already have access to a safeHead or a headMaybe function from a package that we're already using.

Copy link
Contributor

@saurabhnanda saurabhnanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some commentary on how code is logically divided between the following files:

  • ApiBase
  • DataTypes
  • OpaleyeDef
  • Validations

Possible to rename the following modules?

  • RoleApi -> Domain.Role
  • TenantApi -> Domain.Tenant
  • UserApi -> Domain.User

import Data.Aeson (Value(..))
import qualified Data.HashMap.Strict as HM

type AppM a = WriterT String (ReaderT (Connection, Maybe (Auditable Tenant), Maybe (Auditable User)) IO) a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to put some comments in the code itself to describe the use of each of the components in this type:

  • WriterT String
  • ReaderT
  • Auditable Tenant
  • Auditable User

_ -> error "Not a type syn"

makeAudtableLenses :: Name -> Q [Dec]
makeAudtableLenses tq= do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in function name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to put a longish comment in the code itself describe what kind of lenses using this TH function will generate?

type Role = RolePoly RoleId TenantId Text (NonEmpty Permission) UTCTime UTCTime
type RoleIncoming = RolePoly () TenantId Text (NonEmpty Permission) () ()

data Auditable a = Auditable { _data:: a, _log:: Value } deriving (Show)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we exporting the Auditable constructor? Should we be exporting it? Should it be possible to extract an x out of an Auditable x? Is there anything that a user can do on x that he can't do on Auditable x? Can the programmer take an x and manually create an Auditable x out of it? Isn't that a highly unsafe operation, defeating the purpose of audit logs in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Agree on all points. I am already aware of this. Will address it in the coming days...

putUpdatedTimestamp :: (HasUpdatedat item (UTCTime)) => UTCTime -> item -> item
putUpdatedTimestamp timestamp = updatedat .~ timestamp

updateAuditableRow :: (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be in a DB transaction? Have we solved for DB transaction yets?

currentTime <- liftIO getCurrentTime
let updatedItem = (putUpdatedTimestamp currentTime) audti
let Auditable { _data = item, _log = _log} = updatedItem
_ <- updateDbRow table (constant itId) (constant item)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we really be ignoring the result? Isn't it better to use UPDATE... RETURNING * and returning the updated record wrapped in a fresh auditable wrapper?

let Auditable { _data = item, _log = _log} = updatedItem
_ <- updateDbRow table (constant itId) (constant item)
insertIntoLog table itId "" _log
return audti
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the in-memory audit log being blanked out after a successful update or insert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I completely missed that. Will do.

updateDbRow table row_id item = do
auditLog $ "Update :" ++ (show item)
conn <- getConnection
_ <- liftIO $ runUpdate conn table (\_ -> item) (matchFunc row_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we be using runUpdateReturning here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That would be ideal. But since we already hold the record in its updated state, I thought, why do the additional fetching. If there was a failure with update, we will get an exception anyway, right?

I am not sure, but I think the use case of "update returning" would be if we are doing an operation on existing value, like "update product set quantity = quantity + 1", and in that case, returning the new values can save an additional select.

But here we are not doing anything like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel kts better to use updateReturning because it allows us to represent the latest state of the row in the DB, which might be different than what is in the memory, possibly due to DB triggers.

@saurabhnanda
Copy link
Contributor

Overall the Auditable interface looks like a great attempt. Did you already know TH, or did you learn it to solve this problem?

@sras
Copy link
Contributor Author

sras commented Nov 19, 2016

@saurabhnanda

I didn't know TH before. Learned it for this solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants