Conversation

@almereyda
Member

Referring to #426 Deployment of FLOSS document management system.

This pull request combines commits which provide a media management engine for the data service.

@almereyda almereyda self-assigned this Aug 9, 2017
@almereyda almereyda requested a review from acorbi August 9, 2017 15:18
@almereyda
Member Author

@acorbi Please feel free to extend the description of this PR, if you see additional aspects which need to be present before merge.

@acorbi acorbi left a comment

@almereyda See the individual comments within the code plus the following questions:

  • Do I understand correctly that the Data API (data.transformap.co) serves as a proxy for the media-related API calls by exposing a /media/ endpoint in addition to the /place/ endpoint?

  • I do not see any references to the different versions of the media's metadata. I understand this is not yet contemplated here and might be conceived at a later point. For reference, see my draft definition at https://github.com/acorbi/transformap-editor/blob/impl-mocking-libs/app/lib/MMS_API.md. Do you agree with it in general? Would you structure the payloads differently?

UPDATE: See https://tree.taiga.io/project/transformap/task/418 for reference

const put = (url, body) => breq.put(url, body).then(_.property('body'))
const delete_ = (url) => breq.delete(url).then(_.property('body'))

const endpoint = CONFIG.server.url() + '/media'

@acorbi acorbi Aug 9, 2017

I do not see the uuid of the POI the media file(s) are assigned to anywhere.

  • Shouldn't there be a reference to it so the uploaded/retrieved media can be related to a certain POI?

As in, if I POST a new media file stored on IPFS (for example) by POSTing a JSON file similar to test/fixtures/media-new-ipfs.json to https://data.transformap.co/media/, how does the API know which POI I want to attach it to?
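
For illustration, I would guess the fixture carries roughly this shape (the property names here are my assumption, not the actual file's contents):

// hypothetical shape of test/fixtures/media-new-ipfs.json
const mediaNewIpfs = {
  url: 'ipfs://QmExampleHash', // pointer to the asset on IPFS
  mimetype: 'image/jpeg'
}

Note that nothing in it references a POI, which is the point of my question.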

Member Author

This can happen decoupled from the storage of media metadata. It would be up to the user client to update the respective /place/UUID document with an array of UUIDs of associated media.
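
A rough sketch of that client-side association, in the style of the breq helpers from the tests (the get helper and the media property on place are assumptions, not part of this PR):

const get = (url) => breq.get(url).then(_.property('body'))

const associateMedia = (placeUuid, mediaUuid) => {
  const placeUrl = CONFIG.server.url() + '/place/' + placeUuid
  return get(placeUrl)
    .then((place) => {
      // append the medium's UUID to the place document and write it back
      place.media = (place.media || []).concat(mediaUuid)
      return put(placeUrl, place)
    })
}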

@almereyda OK, it is very decoupled indeed, but sounds good to me so far.

Member Author

eventual consistency ;)

})
})
})
describe('POST image', function () {

I do not see any metadata being POSTed along with the BLOB in this case.

  • How is the metadata being uploaded? Shouldn't there be a way to POST it along with the BLOB?

Member Author

There are multiple ways to post it along with the BLOB. Either the client pushes additional associated metadata, such as a name attribute, in a second PUT request, which would not need any further modification of the data service; or multiple POST requests are associated with each other by relying on multipart file streams, which requires more investigation and implementation.
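
A minimal sketch of the first variant, reusing the test helpers above (a post helper analogous to put, and a response carrying the new medium's id, are assumptions):

// 1. POST the BLOB, 2. PUT the associated metadata in a second request
post(endpoint, imageBlob)
  .then((media) => put(endpoint + '/' + media.id, { name: 'garden.jpg' }))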

@almereyda OK, got it. Yeah, a second PUT request sounds easier to implement, although it could introduce the case where a media file does not have the required metadata (so far only name, correct?).

Member Author

@almereyda almereyda Aug 9, 2017

The Content-Disposition header of an HTTP POST upload contains a filename value, from which we could infer a sanitized name. That would cover metadata detection to a broad extent.

Still, this involves correct handling of multipart.

How to prepare this for compatibility with IIIF is left to succeeding iterations on the codebase. I may safely conclude KISS applies for now.
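
A minimal sketch of that inference, assuming the upload middleware exposes the original filename as file.name (the sanitizing rules here are illustrative only):

const inferName = (file) => {
  return file.name
    .replace(/[^\w.-]/g, '_') // whitelist word characters, dots and dashes
    .replace(/_+/g, '_') // collapse runs of replacement characters
}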

Member Author

@almereyda almereyda left a comment

  1. Yes, the /media/ endpoint is intended for a generalised management of media data types, decoupled from objects of the type place.
  2. Versioning is implemented in a generalised way for all things within T418 by @ponder2342, which is why the choice fell on separating media and place data from each other. A connection can be reintroduced by the user client by adding additional data to a place object, as formulated in my replies to your code comments.

Where would you document this kind of specification? Is anything in particular missing?

@thoka has always pointed out that there is no design paper about the data service.

Perhaps @jum-s, @maxlath and I will find some time next week to sketch out such a document?

@almereyda
Member Author

This is slowly evolving, also see https://hack.allmende.io/ecobytes-20170621-mms-research?both#17092017

.then(res.json.bind(res))
.catch(error_.Handler(res))
if (req.files) {
lib.upload(req.body, req.files.mediaUpload) // code smell: hard coded input id, configurable, or per type?

@almereyda

Do I understand correctly that:

req.body would be the metadata of the media file and req.files.mediaUpload the binary contents of the actual asset? Are they both POSTed on the same API method call?

On https://hack.allmende.io/transformaps-20170926-development?view#api-calls-involved-on-uploading-a-media-file-and-associating-it-with-a-poi (and as currently implemented) I am envisioning doing this in 2 calls...

If my statement is correct, our implementations are currently misaligned. Please confirm or elaborate so we can align both implementations.
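
Roughly, the 2 calls I have in mind, in the style of the test helpers above (the /blob route and a post helper are taken from the pad, not from this PR):

// 1. create the media document with its metadata
post(endpoint, metadata)
  // 2. upload the binary contents to the created document
  .then((media) => post(endpoint + '/' + media.id + '/blob', blob))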

Member Author

From the sequence diagrams I learn you expect to POST a BLOB to a separate /blob endpoint associated with any media/:id route. From this I like the idea of separating the creation of a media metadata container (thing) from its filling. Yet I suggest an upload can be triggered in the presence of a multipart stream in any PUT action, and thus be coupled more tightly to a single medium's endpoint.

There may be the need for additional handling of the side effects, e.g. on other things such as place, where one might want to avoid allowing binary uploads.
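
As a sketch, the PUT handler could branch on the presence of a multipart stream (hypothetical, mirroring the excerpt above; lib.update is an assumed helper):

if (req.files) {
  // multipart stream present: trigger the upload alongside the metadata
  lib.upload(req.body, req.files.mediaUpload)
} else {
  // plain JSON body: only update the medium's metadata
  lib.update(req.body)
}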

@acorbi acorbi Oct 8, 2017

@almereyda I have been studying your proposal of coupling the upload of the binary contents of the asset together with its metadata in a POST (creation) or PUT (update) method on /media/ and came up with the following 'dilemma':

As described on https://hack.allmende.io/transformaps-20170926-development?view#api-calls-involved-on-updating-a-poi-by-uploading-a-new-asset and according to the current implementation, updating a media file does not involve a PUT call to the media endpoint but a POST call to media/{mediaId}/version, with the goal of storing a new version object in the DB and not updating the media file's metadata.

I see 2 options at this point to achieve an implementation which would make sense to me:

  1. To leave the multipart upload of the asset decoupled from the creation (POST) or update of a certain media file, as is currently the case.

  2. To delegate the responsibility for the creation of new versions of a media file (journaling) to the data service. This means that the editor would not make any POST calls to media/{mediaId}/versions as is currently the case (see https://github.com/TransforMap/transformap-editor/pull/42/files#diff-e2815ec3044d7d3fc5c27a5a3ed1b358R115). The editor would just make a PUT call to media/{mediaId} and the data service would then store a new version of it. A rough sketch of this option follows below.
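
Sketched, option 2 on the data service side could look like this (the lib helpers are hypothetical, not the current implementation):

const updateMedia = (req, res) => {
  const mediaId = req.params.id
  // journal the previous state as a new version object,
  // then apply the incoming metadata update
  lib.getMedia(mediaId)
    .then((current) => lib.createVersion(mediaId, current))
    .then(() => lib.updateMedia(mediaId, req.body))
    .then(res.json.bind(res))
    .catch(error_.Handler(res))
}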

@almereyda
Member Author

almereyda commented Oct 1, 2017

I am currently reorganising my thoughts around adding to IPFS and updating the media metadata with the mimetype, hash and a URL (generated from a URL template).

If we switched from the node-fileupload approach to multer, IPFS support could be implemented and logically separated as a Multer storage engine, benefiting a wider community of developers. _handleFile and _removeFile would then be mapped to pin (implied on add) and unpin actions with the IPFS daemon. Multer also allows adding a ReadableStream instead of a Buffer, which improves memory handling.

The public and private IPFS daemons may as well be linked to each other by an ipfs-cluster.
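
A sketch of such a storage engine, assuming the ipfs-api client of the day (treat the exact calls as assumptions illustrating the _handleFile/_removeFile mapping, not a finished engine):

const ipfsAPI = require('ipfs-api')

function IpfsStorage (opts) {
  this.ipfs = ipfsAPI(opts.host, opts.port)
}

// multer hands in the request, the file (with a ReadableStream at
// file.stream) and a callback to report the stored file's properties
IpfsStorage.prototype._handleFile = function (req, file, cb) {
  // add() implies pinning on the local daemon
  this.ipfs.add(file.stream)
    .then((results) => cb(null, { hash: results[0].hash }))
    .catch(cb)
}

IpfsStorage.prototype._removeFile = function (req, file, cb) {
  // unpin, so the daemon's garbage collector may reclaim the blocks
  this.ipfs.pin.rm(file.hash)
    .then(() => cb(null))
    .catch(cb)
}

module.exports = (opts) => new IpfsStorage(opts)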
